Biologics Development: How do you Solve a Problem Like Data Management?

Data management solutions exist in early research and production manufacturing, but process development groups need a blend of flexibility and structured data not offered by traditional electronic laboratory notebooks (ELNs), laboratory information management systems (LIMS) and manufacturing execution systems (MES).

The development data problem

The lab today has many informatics solutions at its fingertips. Traditional electronic laboratory notebooks (ELNs) allow researchers to move away from paper approaches. Manufacturing execution systems (MES) are a great fit on the other end of the spectrum too, where processes are consistent and data is captured in a rigid, structured manner. Laboratory information management systems (LIMS) take a sample-centric view of the world, which works well in analytical services and QC laboratories. But process development requires flexibility to make process changes and try new ideas, as well as the structure to allow for consistent data capture and analysis.

As a development scientist turned software consultant, I’ve witnessed this gap from two very different angles. It’s the reason that Excel is still the standard for the majority of process development scientists. Combined with statistical packages, Excel enables data management on an individual level. The issue arises when you want to combine datasets from multiple scientists, either within the same group or across upstream and downstream.

When multiple datasets are involved, individual tools like Excel become cumbersome. Traditional databases have been employed with limited success, but they lack the functionality and scalability required to be anything more than a project data repository. Process development needs a blend of structured and unstructured data, with a heavy focus on analysis and traceability, and the industry is looking to next-generation informatics to solve the problem.

Putting data in

A data management system is only as good as the data that resides in it. Ease of use is paramount, otherwise scientists will be reluctant to enter their data. The ability to reuse templates and previous experiments can help to ease the burden of entering duplicate information for each new experiment. Compatibility with tablets allows for mobility in the lab when bench space is already at a premium.

Most instruments in the lab can be integrated to automate or simplify data capture, but creating a custom integration for every instrument would cost more time and money than it would be worth. Most instruments can be grouped into a few categories of electronic capture methods, so flexible tools that capture data via these methods can serve the purpose without the custom development work. In many cases, the simpler option is the better one because it results in an easier-to-use system.

As you move from earlier to later phase development and pilot plant operations, compliance becomes a major factor for users as well. Paper-based systems often require each entry to be verified because they are so error-prone. By moving to an electronic system, you can engineer out some of these error-prone steps, such as transcribing from instruments or performing calculations by hand. You could take it a step further and flag potential deviations so that users can see a possible issue as soon as it is entered. All of this helps to ensure that the data being entered is of a high quality.
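As a minimal sketch of what flagging a potential deviation at the point of entry could look like, the Python below checks each entered value against an acceptable range. The parameter names, the ranges and the function names are all hypothetical and chosen for illustration; they do not come from any particular product, and real limits would come from the process definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Limit:
    """Hypothetical acceptable range for one process parameter."""
    low: float
    high: float


# Assumed limits, for illustration only.
LIMITS = {
    "pH": Limit(6.8, 7.2),
    "temperature_C": Limit(36.0, 37.5),
}


def flag_deviation(parameter: str, value: float) -> Optional[str]:
    """Return a warning if the value is outside its limit, otherwise None."""
    limit = LIMITS.get(parameter)
    if limit is None:
        return None  # no limit defined for this parameter, nothing to check
    if value < limit.low or value > limit.high:
        return f"{parameter}={value} outside {limit.low}-{limit.high}"
    return None


def check_entry(entry: dict) -> list:
    """Collect warnings for every parameter in one data entry."""
    warnings = []
    for parameter, value in entry.items():
        message = flag_deviation(parameter, value)
        if message:
            warnings.append(message)
    return warnings


if __name__ == "__main__":
    print(check_entry({"pH": 7.5, "temperature_C": 37.0}))
```

The check only warns and never blocks the entry, which keeps the flexibility that development work needs while still making the possible issue visible.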

Simplifying data capture in an electronic interface can save users hours per week over using disparate notebooks, Excel and Word files. Very few scientists would still prefer to cut and paste into a paper lab notebook, given the option. Once the data is in the system, the real fun begins.

Getting data out

The push for quality by design (QbD) has only added to the importance of having easy access to data. Where traditional ELNs typically don’t stack up is on the analysis side. It’s very easy to get data into the system but, without some structure, it can be impossible to query and analyze that data. Comparing across runs, linking results to a sample across multiple assays and users, and generating process reports all require some level of structure.
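To illustrate why structure matters for queries, here is a small Python sketch. It assumes each result is stored as a record with a sample ID, run, assay name and value; the field names and the data are invented for this example. With that minimal structure in place, linking results to a sample and comparing across runs each take a few lines.

```python
from collections import defaultdict

# Hypothetical structured result records; all values are made up.
RESULTS = [
    {"sample_id": "S-001", "run": "R1", "assay": "titer_g_per_L", "value": 2.1, "analyst": "A"},
    {"sample_id": "S-001", "run": "R1", "assay": "viability_pct", "value": 95.0, "analyst": "B"},
    {"sample_id": "S-002", "run": "R2", "assay": "titer_g_per_L", "value": 2.6, "analyst": "A"},
]


def results_by_sample(records):
    """Link every assay result to its sample, regardless of who ran the assay."""
    grouped = defaultdict(dict)
    for record in records:
        grouped[record["sample_id"]][record["assay"]] = record["value"]
    return dict(grouped)


def compare_runs(records, assay):
    """Return one assay's value per run so runs can be compared side by side."""
    return {r["run"]: r["value"] for r in records if r["assay"] == assay}


if __name__ == "__main__":
    print(results_by_sample(RESULTS))
    print(compare_runs(RESULTS, "titer_g_per_L"))
```

The same questions asked of free-text notebook pages would require someone to read and transcribe each page first.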

Simple analysis tools for performing calculations and charting are key, but the ability to port data to other analysis software can be just as important. No single software package can contain all of the functionality that every scientist requires. If one did, it would likely be complex to the point of being unusable, so a combination of built-in analysis tools and data export is typically a requirement of the modern scientist.

Basic ELNs may provide links between experiments, but this is only a small step above manually linking via notebook page numbers. Reporting and analysis are the top benefits that customers find when moving to a more advanced electronic data management system. Electronic data linking and collation for reports can save users days over the traditional route of cross-referencing notebooks and files to pull data together. It also removes the need to reformat or transcribe data when performing aggregate analysis.

The traceability gap

With quality data in the system, and easy access for queries and reporting, there is still a key consideration that is typically overlooked. In development, traceability is key. Linking materials, solutions, cell lines, process intermediates, samples and final products is a complex web that is especially difficult in a development environment, where splitting and pooling of material is commonplace. Often this is done by referencing notebook pages or a complex ID string that requires a decoder ring to understand. Automatically linking entities together in the system allows for traceability when it is needed most.

Suppose a batch of material is discovered to be bad, perhaps because it was made with an expired component or with miscalculated amounts. Scientists would then have to work out where that lot was used, along with every operation and subsequent material that may be affected. I’ve done these types of investigations and it can take weeks of combing through notebooks, binders and various files before you have any idea of the extent of the issue. This is where traceability becomes critical, and paper-based systems just don’t cut it. The ability to query a component or material and see where it has been used, and any subsequent downstream operations, can now save weeks of lost time (which could be used to rerun the bad batches).
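The investigation described above is, at heart, a walk through a graph of linked materials. The Python sketch below assumes the system records a "used in" link each time one entity goes into another; the lot names and the `USED_IN` mapping are hypothetical. A breadth-first traversal then finds everything downstream of a suspect lot, including material that was pooled or split along the way.

```python
from collections import deque

# Hypothetical lineage: each key was used to make the entities in its list.
USED_IN = {
    "BUFFER-LOT-7": ["MEDIA-12", "MEDIA-13"],
    "MEDIA-12": ["BIOREACTOR-RUN-3"],
    "MEDIA-13": ["BIOREACTOR-RUN-4"],
    "BIOREACTOR-RUN-3": ["HARVEST-POOL-1"],
    "BIOREACTOR-RUN-4": ["HARVEST-POOL-1"],      # two runs pooled together
    "HARVEST-POOL-1": ["SAMPLE-A", "SAMPLE-B"],  # pool split into samples
}


def affected_downstream(start, used_in):
    """Return every entity that directly or indirectly used the starting lot."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for child in used_in.get(current, []):
            if child not in seen:  # pooled material is only visited once
                seen.add(child)
                queue.append(child)
    return seen


if __name__ == "__main__":
    print(sorted(affected_downstream("BUFFER-LOT-7", USED_IN)))
```

Reversing the mapping would answer the opposite question, namely which lots went into a given sample.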

Traceability is also essential for samples. Submitting samples for analysis and accessing the corresponding results is typically a very low-tech process. Anything from paper forms to emails to asking verbally for analysis to be run is often the norm. Electronic systems are capable of managing this information as well, giving analytical scientists consistent access to incoming sample requests and development scientists faster and more accurate access to their results.

The development data solution

Modern technology can help development organizations optimize processes, cut down on errors and deviations and generally save time and money. Most customers I’ve worked with agree that there is a lot of value in streamlining the handling of data; however, there is no one-size-fits-all solution. Different groups have different needs, so evaluating the requirements and goals is key to finding the right fit. Putting the effort in up front to clearly define the needs and benefits will help to build the business case, as well as find the solution. One thing is for certain: next-generation informatics tools will be critical to the process development scientists of the future.

For more information, visit
