How AI can help document legacy COBOL code, before it’s too late

COBOL is one of the oldest programming languages still widely used to power mission-critical applications in many industries. A survey in February 2022 found that there are 775 to 850 billion lines of COBOL code in daily use.

But while COBOL is still used by many organizations, the number of COBOL developers continues to decline. After all, COBOL was first released in 1959, more than six decades ago, so it isn’t the most appealing language for new developers to learn. Perhaps even more worrisome is that as the current generation of COBOL developers retires, real knowledge of how COBOL applications are built and structured may be lost.

COBOL Colleague, a new AI-powered tool being developed by the newly established Phase Change Software, could be a solution to this challenge. The president of Phase Change Software said COBOL is not going away anytime soon. And while it’s hard to find a COBOL developer, he said, that’s not the real problem.

“The real underlying problem is the knowledge of what the applications do, because to change code effectively, you need to understand what the code does,” he said.

Simply having the ability to make code changes isn’t enough. What is needed is knowledge about the code itself, which is an issue across many programming languages, though it is an acute problem for COBOL.

“With COBOL being 60 years old, we’re not just having people leave, we’re having people permanently retire and that knowledge is simply not available,” he said.

The growing market for AI-powered coding tools

A growing number of development tools claim to use AI to help developers be more efficient.

There are low-code and no-code tools that use AI to help organizations write new code and build applications, without needing to first learn a programming language. Then, there are also advanced tools that help developers write code in actual programming languages, including the popular GitHub Copilot service.

The Phase Change president said GitHub Copilot is a code suggestion tool that is very different from what his company is building. He noted that Copilot helps developers write code, but it doesn’t help them maintain code after it has been written.

“We’re in the change the code business, not in the creation of the code business, and that’s one of the big differences,” he said.

How COBOL Colleague uses AI

Phase Change Software is not taking the typical machine learning approach for its AI that requires training on a data set.

“Part of the complication when it comes to source code is getting a source code repository large enough to be able to train on,” he said.

The other challenge is what is known as path explosion. When code runs, an operation can go down any number of different paths because of conditional branches such as ‘if’ and ‘else’ statements. With each branch, what the operation does can change depending on different variables or conditions. Because every additional branch multiplies the number of possible paths, the number of permutations becomes astronomical and is simply not feasible for a typical machine learning training model.
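
To give a sense of the scale, here is a minimal Python sketch of that growth. It is not taken from Phase Change; the function names are hypothetical, and it assumes the simplest case of independent two-way decisions executed one after another, where each decision doubles the number of possible paths.

```python
from itertools import product


def count_paths(num_decisions: int) -> int:
    """Number of distinct execution paths through `num_decisions`
    independent two-way (if/else) decisions executed in sequence."""
    return 2 ** num_decisions


def enumerate_paths(num_decisions: int):
    """Yield every path as a tuple of branch choices ('if' or 'else')."""
    return product(("if", "else"), repeat=num_decisions)


if __name__ == "__main__":
    # Small cases can be enumerated directly and checked against the formula.
    assert len(list(enumerate_paths(3))) == count_paths(3)

    # Larger cases can only be counted, not enumerated in practice.
    for n in (10, 20, 40, 80):
        print(f"{n:>2} decisions -> {count_paths(n):,} paths")
```

Real programs also contain loops and nested conditions, so this simple doubling is only an illustration of why exhaustively covering every path quickly becomes impractical.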

“We solved the problem with AI techniques around symbolic machine learning,” he said. “So there is no training data set; the only input to our tool is the source code.”

Symbolic AI is about learning in a way that is closer to how humans reason about the world, in terms of cause and effect. He explained that behavior in code is also cause and effect, with inputs and outputs.

“So, if you can turn the computation into a cause-and-effect model, then you can use techniques from cognitive science and AI to reason on that internal representation,” he said.

The Phase Change COBOL Colleague software takes COBOL source code and uses symbolic machine learning and static analysis techniques to turn the code into a cause-and-effect model. That model can then help organizations understand and maintain the code.
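
The article does not describe how COBOL Colleague builds its model, so the following Python sketch is only a toy illustration of what a cause-and-effect view of code could look like. Everything in it is an assumption made for the example: the `Assign` and `If` classes, the `cause_effect_rules` function, and the small fee calculation. It walks a tiny program and, for each path, lists the conditions that must hold (the causes) and the assignments that result (the effects).

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Assign:
    """Hypothetical statement: store a value in a variable."""
    target: str
    value: str


@dataclass
class If:
    """Hypothetical two-way conditional with a list of statements per branch."""
    condition: str
    then_branch: List["Stmt"]
    else_branch: List["Stmt"]


Stmt = Union[Assign, If]
Rule = Tuple[Tuple[str, ...], Tuple[str, ...]]


def cause_effect_rules(stmts: List[Stmt], causes=(), effects=()) -> List[Rule]:
    """Return one (causes, effects) pair for every path through `stmts`."""
    if not stmts:
        return [(causes, effects)]
    head, rest = stmts[0], stmts[1:]
    if isinstance(head, Assign):
        effect = f"{head.target} = {head.value}"
        return cause_effect_rules(rest, causes, effects + (effect,))
    # For a conditional, follow both branches, recording which way we went.
    taken = cause_effect_rules(
        head.then_branch + rest, causes + (head.condition,), effects
    )
    skipped = cause_effect_rules(
        head.else_branch + rest, causes + (f"NOT ({head.condition})",), effects
    )
    return taken + skipped


if __name__ == "__main__":
    # Made-up fee calculation, written with COBOL-style data names.
    program = [
        If("BALANCE < 0", [Assign("FEE", "25")], [Assign("FEE", "0")]),
        If("ACCOUNT-TYPE = 'P'", [Assign("FEE", "0")], []),
    ]
    for rule_causes, rule_effects in cause_effect_rules(program):
        print(" AND ".join(rule_causes), "=>", "; ".join(rule_effects))
```

A listing like this lets someone ask "under what conditions does FEE end up as 25?" without reading the code line by line. Note that enumerating every path this way runs straight into the growth in path counts described earlier, which is one reason a practical tool would need more sophisticated techniques than this sketch.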

COBOL Colleague will initially be available for deployment in on-premises environments, running on Linux. He noted that the largest corpus of COBOL code remains on-premises, is valuable intellectual property and isn’t likely to move off-site.

While COBOL is Phase Change’s first target, over time the company could well expand to support other programming languages.

“Software developers spend 80% of their time trying to figure out where in the code they need to make a change,” he said. “The steps to do that are the same regardless of programming language and that’s what we’re automating.”