Notes on how to upgrade a legacy Ruby application
Feb 21 2021. Latest update: Mar 6 2021
Upgrading any legacy application is a headache. You are trying to pay off years of technical debt in a couple of weeks or months. It isn't easy, but it is also the best time to get things right for the next time you need to do an upgrade. In this short post, you'll find some of my notes on upgrading a legacy Ruby application.
Let's start with a step that we sometimes ignore, even though it is the most important one: taking good notes.
Table of Contents
- Take good notes
- Creating a clone dev environment
- Dealing with old dependencies
- Updating the
- Make use of version control
- Backup your data
- Rails application specifics
- Test automation
- Final thoughts
Take good notes
Before you start building the development environment, make sure you are taking notes and writing everything down. You might need to restore something or understand your decisions later.
I keep my notes in plain text, but you can store them however you prefer. The important part is that you take notes. Write the commands you use and also the reasoning behind them. I write my notes at the same time I'm running the commands. That doesn't mean that they are chronological, because sometimes I make mistakes. I don't want anyone following my notes to execute every single command I wrote and make the same errors, so I edit my notes to reflect the most logical order of steps to achieve the desired outcome.
I do sometimes leave failed attempts in, just as a warning for the future reader. Find a style that works for you, and think of what information you would like to have if you were doing it for the first time.
Creating a clone dev environment
To begin with, we need an environment as close as possible to what the application needs. We can't test how our code will behave on a new system if we can't play with making minor adjustments.
I prefer to start with a brand new Virtual Machine (VM). I don't use a specialised, already-set-up VM that contains all the dependencies and everything I could need. Why? Because I've found that the more libraries and code involved, the more chances that something can go wrong. You won't know whether an error comes from a bug in your code or from an incompatibility with a library or gem that you don't even require. So start with a bare VM on a modern Operating System and go from there.
I use Vagrant to set up my development VMs, and I usually choose the VMs set up by Chef's Bento Project. You can find the VMs here.
The idea is to have a VM running your code in an environment as close as possible to the server, with the added benefit of being able to take snapshots and roll back easily if the change you made didn't work or a better option presents itself.
Let's see an example of when this will be helpful. Imagine I had an application that ran on Ruby 2.2, but it requires libssl1.0-dev. If I use a VM that already has libssl1.1 or more recent, it'll break my installation. I will have to uninstall all the pre-installed libraries and dependencies, which is extra work. So let's avoid that and start with the cleanest install possible.
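A minimal Vagrantfile for such a bare VM might look like the sketch below. It assumes VirtualBox as the provider and the bento/ubuntu-20.04 box; the memory and CPU values are placeholders, not settings taken from a real project.

```ruby
# Vagrantfile: a bare Ubuntu VM with nothing pre-installed.
# Assumptions: VirtualBox provider, bento/ubuntu-20.04 box, placeholder sizing.
Vagrant.configure("2") do |config|
  config.vm.box = "bento/ubuntu-20.04"

  config.vm.provider "virtualbox" do |vb|
    vb.memory = 2048
    vb.cpus = 2
  end

  # Deliberately no provisioning block: dependencies get installed by hand
  # (and written down in the notes) until the requirements are understood.
end
```

With this file in place, vagrant up boots the VM, vagrant snapshot save clean-install records a checkpoint, and vagrant snapshot restore clean-install rolls back to it. The snapshot name is arbitrary.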
Creating the infrastructure using Docker
The reason I start with a VM is that I have no idea what I'll need. I sometimes spend a lot of time building a snowflake server until I understand the specific requirements. Once I know what I'm working with, I want a way to reproduce the server or roll back a change to try a different library or piece of code. This is where Docker comes in for me. I can use the Dockerfile to document how I built the server and which specific libraries I chose.
Once I have that, it is easy to modify the server to accommodate any new requirement and have it documented.
Creating the infrastructure using Ansible
Another option that I have been exploring is using Ansible playbooks as my Infrastructure as Code and documentation.
Once I've understood all the requirements and the infrastructure I need to build, I create Ansible playbooks.
Whether I use Ansible or Docker, I add the files to a version control system. That way, I can easily roll back, branch out, or make any changes needed, confident that I can revisit several checkpoints if I have to.
Let's now talk about a big part of legacy systems: deprecated dependencies.
Dealing with old dependencies
Depending on how old your dependencies are, you have three options:
- Install from an old package repository
- Install from
- Install from source
I'll explain how to do it on Ubuntu, but your distro should have something similar.
Installing from an old package repository
Let's use the libssl1.0-dev library as an example. Let's say I'm running the latest Ubuntu LTS (20.04 at the time of writing). It won't contain the required library, so what to do? The first step is to search for it on Ubuntu Package search and check which release included the package. If you are lucky and it is still on a repository you can access, add that repository to your
sudo vi /etc/apt/sources.list
Let's imagine we found the package in the Ubuntu Package search results for libssl1.0.0, and we can get the package from the repository for the bionic release of Ubuntu. We could add the following repository to our source list:
## Adding this repo for libssl-1.0-dev
deb http://security.ubuntu.com/ubuntu bionic-security main
We can now verify it is available on our system for installation:
$ sudo apt-get update
$ apt-cache policy libssl1.0-dev
libssl1.0-dev:
  Installed: (none)
  Candidate: 1.0.2n-1ubuntu5.5
  Version table:
     1.0.2n-1ubuntu5.5 500
        500 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages
Yay! Now we can install it.
Installing from a
What happens if we can't find a repository that still has the package? The solution is to search for the specific package. If it existed before, you'd probably find it on Launchpad. It hosts many packages, even old ones. For example, imagine we need to install postgresql-server-dev-9.5. You can search for it on your favourite search engine and get the package on launchpad.net.
You'll see that you can find all the .deb packages there. That will be the solution. You can now download it and install it from the
$ sudo dpkg -i postgresql-server-dev-9.5_9.5.13-0ubuntu0.16.04_amd64.deb
If all else fails, use the source.
Installing from source
If you can't find a package repository with the version you need or find a package in any other place, you'll have to compile from the source. There are not many general instructions I can give you here. You'll have to read the documentation for the library or software you need to install. But it generally goes like this:
- Read the documentation to know which configuration settings work for you.
- ./config with your settings
- make && make install
I can't offer much help in this case because each library or software you need to install will be different.
Anyhow, once you have all your dependencies ready, it is time to check the
If your program does not use a Gemfile and everything was installed manually just using gem install, check which gems were installed on the original server:
$ gem list
Then create a Gemfile with those gems and versions. Let's now assume that we are all working with a decent Gemfile. You don't want to run a bundle update and have everything break. You won't even know where to start with so many gems being updated at the same time. So let's make a plan.
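As a rough sketch of how the gem list output could be turned into a starting Gemfile, the script below parses each line and pins the gem to an exact version. The sample output is made up, and picking the first version listed is an assumption: if several versions are installed, check which one the application actually loads.

```ruby
# Matches lines such as "rails (4.2.11.3, 4.2.0)" or "rake (default: 10.4.2)".
GEM_LINE = /\A(\S+) \((.+)\)\z/

def gemfile_lines(gem_list_output)
  gem_list_output.each_line.map do |line|
    match = GEM_LINE.match(line.strip)
    next unless match

    # Assumption: pin the first version listed; drop the "default: " marker
    # and any platform suffix that follows the version number.
    version = match[2].split(", ").first.sub("default: ", "").split.first
    %(gem "#{match[1]}", "#{version}")
  end.compact
end

# Made-up sample output; on the real server you would capture `gem list`.
sample = <<~OUTPUT
  *** LOCAL GEMS ***

  pg (0.18.4)
  rails (4.2.11.3, 4.2.0)
  rake (default: 10.4.2)
OUTPUT

puts 'source "https://rubygems.org"'
puts gemfile_lines(sample)
```

Exact pins like these are only a starting point. They reproduce the old server first, so that every later change to a version is a deliberate one.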
NOTE: before proceeding, make sure you are using a version control mechanism for your code.
When we need to upgrade a legacy project, it is because the client is faced with a problem they can't ignore. Perhaps the server they have been working on for years reached its end-of-life (EOL), or they can't migrate to the latest version of Rails, or something else is forcing the upgrade. So we start from there.
We use the requirements to guide how we approach the upgrade. We don't want to run "bundle update" and update every single gem in one go. Too many things will change simultaneously, and it'll be harder to fix everything at the same time. We'll do incremental updates. Let's say you are running a Rails 4.2 application using Ruby 2.2.2, and the new server requires us to run Ruby 2.5.0. That will be our goal, and with that goal in mind, we go step by step.
For this case, because we need to work with multiple versions of Ruby, we would install a Ruby version manager. I use rbenv with the plugin rbenv-gemset. That way, we can easily switch between Ruby versions and also create different gemsets if needed.
Pinning Gem versions
Gemfile.lock will be generated after we use bundle install, but the problem now is that we don't want to jump too many versions at one time. Why should we do incremental upgrades? Let me give you an example. Imagine we are working with a gem called mygem. This fictitious gem has had a regular cadence of changes and improvements. Imagine you started using it on 0.7.0 (as a rule of thumb, you shouldn't use 0.x versions of any gem in production). The current version is 3.14.15. If your Gemfile only references it without pinning it to any version, then when you run bundle update mygem you'll jump to the latest version. You would have skipped two major versions, which will very likely break your code.
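To see what a pin actually allows, here is a small sketch that uses RubyGems' own Gem::Requirement and Gem::Version classes to check constraints against versions of the fictitious mygem. The helper name allows? is made up for this example.

```ruby
require "rubygems"

# Returns true when a Gemfile-style constraint accepts the given version.
def allows?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# "mygem" and its version numbers are the fictitious example from the text.
puts allows?(">= 0", "3.14.15")     # an unpinned gem accepts the latest release
puts allows?("~> 0.7.0", "3.14.15") # pinned to 0.7.x: the latest is rejected
puts allows?("~> 0.7.0", "0.7.9")   # patch releases are still allowed
puts allows?("~> 1.0", "1.9.2")     # next step: open up one major version
puts allows?("~> 1.0", "2.0.0")     # but not the one after it
```

In the Gemfile, the pin is written as gem "mygem", "~> 0.7.0". Loosening it one major version at a time keeps each upgrade step small enough to debug.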
So whenever you upgrade your gems, if you run bundle update on a specific gem, verify that the dependency gems don't jump major versions that break your code. Pin the dependent gems. You should be in control of how fast your gems are moving. Always run a diff on your Gemfile.lock after any bundle operation. You'll be running the following command multiple times.
$ git diff Gemfile.lock
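Reading that diff by eye gets tedious on a large Gemfile.lock. As a rough sketch, the script below compares two lockfile texts and lists the gems whose major version increased. It uses a simplified line match rather than Bundler's own parser, and the lockfile contents are made up around the fictitious mygem.

```ruby
require "rubygems"

# Matches resolved gem lines in the "specs:" section, e.g. "    rake (10.4.2)".
# Dependency lines are indented further and carry constraints, so they are skipped.
SPEC_LINE = /\A {4}(\S+) \(([^)]+)\)\z/

def locked_versions(lockfile_text)
  lockfile_text.each_line.each_with_object({}) do |line, versions|
    match = SPEC_LINE.match(line.chomp)
    next unless match

    # Drop any platform suffix such as "-x86_64-linux" before parsing.
    versions[match[1]] = Gem::Version.new(match[2].split("-").first)
  end
end

# Names of gems present in both lockfiles whose major version went up.
def major_jumps(old_lock, new_lock)
  before = locked_versions(old_lock)
  locked_versions(new_lock).select do |name, version|
    before.key?(name) && version.segments.first > before[name].segments.first
  end.keys
end

old_lock = <<~LOCK
  GEM
    remote: https://rubygems.org/
    specs:
      mygem (0.7.0)
      rake (10.4.2)
LOCK

new_lock = old_lock.sub("mygem (0.7.0)", "mygem (3.14.15)").sub("rake (10.4.2)", "rake (10.5.0)")

puts major_jumps(old_lock, new_lock)
```

In a real repository, the old text could come from git show HEAD:Gemfile.lock and the new text from the file on disk.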
The developer community has different thoughts on version pinning. Some think that you should always run on the latest. For our case, it is different. We want to be in control of the upgrade process. After your application is up to date, you can decide if that is a strategy that works for you.
Make use of rubygems.org
Following the incremental upgrades plan, use the Ruby Gems site to search for your gem and look at the dependencies. The site has a very useful interface where you can see the dependencies and navigate between versions.
Make use of version control
I've mentioned before that having a version control system is essential. It allows you to experiment and roll back the changes you made with ease. It also allows you to create branches for changes you make to your Gemfile, and once you feel comfortable with the upgrade, merge it to your main branch. I've sometimes used a separate branch to test if a gem replacement is feasible. Sometimes a gem is abandoned, and your only choice is to replace it with a new one or fork it and maintain it.
Use branches for those cases and experiment with ways to make your future upgrades more manageable.
Backup your data
Not much controversy here. This section shouldn't even exist. It should be a given, but let's play it on the safe side.
- Create a backup
- Test that the backup works (not by replacing your current database but restoring it somewhere else and making sure it is complete)
Rails application specifics
Make sure to run rails app:update and check the diff for each file before saying yes. If you had all your files in version control, which you should have had, you can accept all changes and do the diff with your favourite diff tool, but I think it is important to look at the diff before accepting the changes.
Test automation
If you don't have tests on your codebase, today is an excellent time to start. Add basic unit tests; make sure that every critical operation has a way of validating that it is still working. I would suggest:
- Start with basic tests for CRUD operations on models.
- Add critical validations, e.g. your balance should never be negative, or every book should have an author.
- Create happy path validation. This means that when everything is as expected, your tests should pass.
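To make the list above concrete, here is a minimal sketch using Minitest. The Account class is a hypothetical stand-in written in plain Ruby so the example runs on its own; in a Rails application the model would be an ActiveRecord class and the rule would usually be a validation.

```ruby
require "minitest/autorun"

# Hypothetical stand-in for a model; in a Rails app this would be an
# ActiveRecord model with a validation instead of a hand-written check.
class Account
  attr_reader :balance

  def initialize(balance = 0)
    @balance = balance
  end

  def withdraw(amount)
    raise ArgumentError, "insufficient funds" if amount > balance

    @balance -= amount
  end
end

class AccountTest < Minitest::Test
  # Happy path: everything is as expected, so the operation succeeds.
  def test_happy_path_withdrawal
    account = Account.new(100)
    account.withdraw(30)
    assert_equal 70, account.balance
  end

  # Critical validation: the balance should never be negative.
  def test_balance_never_goes_negative
    account = Account.new(10)
    assert_raises(ArgumentError) { account.withdraw(50) }
    assert_equal 10, account.balance
  end
end
```

Tests this small are enough to tell you, after each gem or Ruby version bump, whether the critical rules still hold.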
After having that, you can use some code coverage tools to know which part of your codebase is not being handled by your tests. Depending on your time restrictions, talk with your teammates and decide if now is the best time to add more tests and code coverage. Without test automation, there is no simple way to validate your changes, and you don't want your clients to be your QA team. Many of them will quit using your software before filling in a report on what is not working.
Final thoughts
Upgrading a legacy project is challenging, so do yourself a favour and make it as painless as possible. I can't stress enough the importance of good notes and version control. They have saved me many times. It is also important to remember that the application is important enough to merit an upgrade, so ensure that everything is working as intended before replacing it with the new version. Please make sure you have backups of everything and that those backups work. Try to restore from them before assuming they are ok.
If possible, try to install and run the legacy project on a VM or a container. That way, you make sure that even if the current server gets decommissioned, you can make it accessible even if it is in a limited capacity while you do the upgrade.
Ok, that's it for this short post. I hope it is helpful. If you have any other suggestions, let me know. I'm always keen on learning new tips and tricks.