AL-Base update 9/2021
We recently updated and relaunched AL-Base, our database of antibody light chain sequences. This unique collection of curated protein sequences is a useful resource that has been used by many groups around the world. The database and site were originally built by Kip Bodi in the Amyloidosis Center in 2008 and were overdue for an update. We had been collecting new sequences and working out how to update the site when its server went offline earlier this year. Unfortunately, we were unable to resurrect the site and had to rebuild it from scratch, although we had at least managed to back up the data.
We reached out to the School of Public Health’s Biostatistics and Epidemiology Data Analytics Center (BEDAC), and Dr. Axin Hua has built a new site, on a modern framework, that has restored AL-Base’s core functionality. The site has a new address at https://wwwapp.bumc.bu.edu/BEDAC_ALBase but the old URL (albase.bumc.bu.edu) will redirect there. The site still looks very similar but the underlying web application is completely new. For example, downloading sequences should now be much easier than in the previous version. There are still some things to rebuild, but the database is usable again. Looking ahead, we aim to extend the database and add new functionality over the coming months.
There are some changes to the sequences available. We have removed a number of duplicated or redundant sequences, so each sequence is now unique. As a result, there are fewer sequences in total, but analyzing them should be easier. We are working on a way to handle multiple instances of the same sequence, such as when there are both nucleotide and protein sequences in various databases. In the meantime, we have tried to be conservative and have excluded all copies of a sequence if there is any ambiguity about which is the “correct” one.
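For readers who want to apply a similar conservative rule to their own sequence collections, here is a minimal Python sketch. It is not the code used to build AL-Base. The record layout (a clone identifier paired with a protein sequence), the function name `keep_unambiguous`, and the example identifiers and sequences are all hypothetical, and "ambiguity" is interpreted here simply as copies of the same clone that disagree with each other.

```python
from collections import defaultdict


def keep_unambiguous(records):
    """Keep one sequence per clone, dropping clones whose copies disagree.

    `records` is an iterable of (clone_id, sequence) pairs. This layout is
    an assumption for illustration, not the AL-Base schema.
    """
    copies = defaultdict(set)
    for clone_id, sequence in records:
        # Normalize case and whitespace so trivial differences do not
        # count as disagreement between copies.
        copies[clone_id].add(sequence.strip().upper())

    kept = {}
    for clone_id, sequences in copies.items():
        if len(sequences) == 1:
            # All copies agree, so a single entry is retained.
            kept[clone_id] = next(iter(sequences))
        # Otherwise the copies conflict and every copy is excluded.
    return kept


# Made-up example records: AL-002 has two conflicting copies.
records = [
    ("AL-001", "DIQMTQSPSS"),
    ("AL-001", "diqmtqspss"),
    ("AL-002", "EIVLTQSPGT"),
    ("AL-002", "EIVLTQSPAT"),
    ("AL-003", "QSVLTQPPSA"),
]

kept = keep_unambiguous(records)
print(kept)
```

In this sketch the two copies of AL-001 collapse into one entry, while AL-002 is dropped entirely because its copies differ. That mirrors the "exclude when in doubt" approach described above, at the cost of losing some sequences.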
We have also scoured the literature for new light chain sequences associated with amyloidosis or related plasma cell disorders, adding 177 new sequences. In total, AL-Base now contains 556 unique AL-associated light chains and 242 sequences from other plasma cell disorders.
Please let us know if you use the new site and have any suggestions for improvements or features that you would like to see. If you use AL-Base in your work, please cite the original paper and acknowledge the support of NIH grant HL68705.