How to Write Unit Tests for COBOL Programs That Have Never Had Any

COBOL testing is hard because legacy programs rely on global WORKING-STORAGE, hardcoded calls, embedded file I/O, and CICS runtime behavior, so standard unit-testing patterns from modern languages do not apply cleanly.

Somewhere in your organization, a batch COBOL program processes millions of transactions every night. It was written in 1994. It has been modified 347 times. It has zero tests.

You know it works because it ran last night. That is the entire safety net. The moment someone asks you to change a calculation, add a field, or refactor a paragraph, you are flying blind. One wrong MOVE statement and the overnight batch abends at 2 a.m., downstream feeds stall, and the on-call team pages you.

Why Standard Unit Test Patterns Don't Work in COBOL

If you come from Java, Python, or C#, your instinct is to isolate a method, mock its dependencies, and assert the output. COBOL does not let you do that. The language was not designed for testability, and the runtime environment actively works against isolation.

WORKING-STORAGE is global state. WORKING-STORAGE SECTION is the area in a COBOL program where variables are declared and initialized. In most legacy programs, nearly every variable lives there, and every paragraph in the program reads and writes to it. Paragraphs take no parameters and have no local scope of their own. When paragraph A modifies WS-TOTAL-AMT, paragraph B sees that change immediately. You cannot test paragraph B without knowing what paragraph A did to shared memory.

Under CICS, the problem is worse. CICS GETMAINs a separate copy of WORKING-STORAGE for each task, but that storage is bounded by Storage Check Zones (SCZ). If your program overflows a table or writes past a field boundary, it can corrupt the SCZ. CICS will not detect this until it performs a FREEMAIN or the task ends, at which point you get a storage violation that is nearly impossible to trace back to the offending statement.

No dependency injection. COBOL programs call external subroutines with CALL statements and read files with READ verbs. These dependencies are hardcoded. There is no interface to swap in a test double. If your program calls CALC-INTEREST, it calls exactly CALC-INTEREST, and that subroutine had better exist at link time for a static call, or at run time for a dynamic one.

File I/O is baked into business logic. A typical COBOL program opens a sequential file, reads records in a PERFORM UNTIL loop, applies business rules, and writes output records. The file operations are not abstracted behind a data-access layer. They are three lines away from the business logic you want to test.
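The shape described above can be sketched in a few lines. This is an illustrative program, not taken from any real system: the program name NIGHTLY, the file name TRANSIN, the record layout, and the paragraph names are all assumptions, and on z/OS the ASSIGN clause would normally name a DD rather than a literal.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NIGHTLY.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT TRANS-FILE ASSIGN TO 'TRANSIN'
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-TRANS-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  TRANS-FILE.
       01  TRANS-RECORD.
           05  TR-ACCOUNT        PIC X(10).
           05  TR-AMOUNT         PIC 9(7)V99.
       WORKING-STORAGE SECTION.
      *> Shared by every paragraph below; nothing is passed as a
      *> parameter.
       01  WS-TRANS-STATUS       PIC XX VALUE SPACES.
       01  WS-TOTAL-AMT          PIC 9(9)V99 VALUE ZEROS.
       01  WS-FEE                PIC 9(5)V99 VALUE ZEROS.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *>   File handling and business rules share one loop.
           OPEN INPUT TRANS-FILE
           READ TRANS-FILE
           PERFORM UNTIL WS-TRANS-STATUS NOT = '00'
               PERFORM ADD-TO-TOTAL
               READ TRANS-FILE
           END-PERFORM
           CLOSE TRANS-FILE
           PERFORM CALC-FEE
           DISPLAY 'TOTAL: ' WS-TOTAL-AMT ' FEE: ' WS-FEE
           STOP RUN.
      *> Uses the record area filled by the READ in MAIN-PARA and
      *> updates the shared total.
       ADD-TO-TOTAL.
           IF TR-AMOUNT > 0
               ADD TR-AMOUNT TO WS-TOTAL-AMT
           END-IF.
      *> The fee depends on whatever ADD-TO-TOTAL left behind in
      *> WS-TOTAL-AMT.
       CALC-FEE.
           IF WS-TOTAL-AMT > 1000
               MOVE 25.00 TO WS-FEE
           ELSE
               MOVE ZEROS TO WS-FEE
           END-IF.
```

To exercise CALC-FEE here, a test has to either run the whole read loop against a real file or reach into WORKING-STORAGE and set WS-TOTAL-AMT by hand. That choice is what the tooling in the next section is designed to make manageable.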

These constraints mean you cannot adopt standard xUnit patterns without adapting them. You need COBOL-specific tooling.

The Three Tools You Need to Know

ZUnit (IBM). The z/OS Automated Unit Testing Framework, part of IBM Developer for z/OS (IDz). ZUnit generates test case programs from your source, lets you create stub programs for external subroutines and file I/O, and runs tests on z/OS or in a local emulated environment. It integrates with the IDz Eclipse-based IDE and the newer VS Code extension (TAZ). Use ZUnit when your programs run on z/OS and your shop already has IDz licenses.

MFUnit (Rocket Software / Micro Focus). The Micro Focus Unit Testing Framework ships with Visual COBOL and Enterprise Developer. It runs tests as compiled COBOL programs, supports Test Explorer integration in VS Code, and provides the mfupp preprocessor for programs containing EXEC CICS or EXEC SQL statements. Use MFUnit when your team develops on Windows or Linux workstations and deploys to a Micro Focus runtime or mainframe.

GnuCOBOL with GCBLUnit. GnuCOBOL is an open-source COBOL compiler that translates COBOL to C, then compiles to native executables. GCBLUnit is a lightweight test runner written in COBOL for GnuCOBOL. It supports assertions, JUnit XML reporting, and CI integration. No mainframe required. Use GCBLUnit for batch-only programs that do not depend on CICS, IMS, or DB2, or for proof-of-concept test suites before investing in licensed tooling.

Setting Up ZUnit

The following steps assume you have IBM Developer for z/OS (IDz) 15.0 or later, or the TAZ VS Code extension. The process has four stages documented in IBM's ZUnit guide.

Step 1: Configure the Property Group. Open your property group settings and enable the ZUnit options. Set the compiler options for the test case (TEST, NODYNAM at minimum). Specify the target load library where compiled test cases and stubs will reside. Set the runtime options including the test runner invocation parameters.

Step 2: Generate and Record the Test Case. Right-click your COBOL source program and select z/OS Automated Unit Testing Framework (ZUnit) > Generate Test Case. IDz creates a test case program along with a configuration file and, for CICS/Db2 programs, a playback file. For programs with live middleware dependencies, ZUnit supports a recording mode: you execute the actual transaction against a running CICS or Db2 environment while ZUnit captures the input/output data. That recorded data is then replayed in subsequent test runs without requiring a live environment. For pure batch programs, you populate the input data fields manually in the Test Case Editor.

Step 3: Write Stub Programs. For every external CALL target and every file I/O operation in your subject program, create a stub. A stub is a minimal COBOL program with the same PROGRAM-ID as the real subroutine. It accepts the same parameters but returns hardcoded or controlled test data instead of calling a real database or reading a real file. At link time, the linker resolves the CALL to your stub instead of the production module.

Step 4: Build and Run. Compile the subject program, the test case, and all stubs. Link them together. Run the test case using the ZUnit runner (the modern runner is BZUPLAY, the IBM z/OS Dynamic Test Runner). The framework executes the test entry point, calls into your subject program, and compares actual output fields against expected values. Results appear in the IDE with pass/fail indicators.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. TEST-CALC-INTEREST.
      *> Conceptual test case for CALC-INTEREST
      *> (Production ZUnit uses AZUTCINI/AZUTCADD API)
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-PRINCIPAL      PIC 9(9)V99 VALUE 100000.00.
       01  WS-RATE           PIC 9(3)V99 VALUE 005.50.
       01  WS-TERM-MONTHS    PIC 9(3)    VALUE 012.
       01  WS-RESULT         PIC 9(9)V99 VALUE ZEROS.
      *> Simple annual interest: 100000 x 5.5 / 100 = 5500.00
       01  WS-EXPECTED       PIC 9(9)V99 VALUE 005500.00.
       PROCEDURE DIVISION.
       TEST-ENTRY.
      *>   Arrange: set the inputs explicitly so the test does not
      *>   depend on the VALUE clauses above.
           MOVE 100000.00 TO WS-PRINCIPAL
           MOVE 005.50 TO WS-RATE
           MOVE 012 TO WS-TERM-MONTHS
      *>   Act: call the subroutine under test.
           CALL 'CALC-INTEREST' USING
               WS-PRINCIPAL
               WS-RATE
               WS-TERM-MONTHS
               WS-RESULT
      *>   Assert: compare the result and signal via RETURN-CODE.
           IF WS-RESULT NOT = WS-EXPECTED
               DISPLAY 'FAIL: Expected ' WS-EXPECTED
                       ' Got ' WS-RESULT
               MOVE 8 TO RETURN-CODE
           ELSE
               DISPLAY 'PASS: CALC-INTEREST correct'
               MOVE 0 TO RETURN-CODE
           END-IF
           STOP RUN.
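The stub described in Step 3 might look like the sketch below. It is hand-written for illustration and is not what ZUnit generates. It would be linked in place of the real CALC-INTEREST when the program under test is a caller of that subroutine, not the subroutine itself. The parameter layout is assumed to match the four fields passed in the conceptual test case, and the LS- names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CALC-INTEREST.
      *> Test stub: same PROGRAM-ID and parameter list as the real
      *> subroutine, but it returns a canned value instead of
      *> computing anything.
       DATA DIVISION.
       LINKAGE SECTION.
       01  LS-PRINCIPAL      PIC 9(9)V99.
       01  LS-RATE           PIC 9(3)V99.
       01  LS-TERM-MONTHS    PIC 9(3).
       01  LS-RESULT         PIC 9(9)V99.
       PROCEDURE DIVISION USING LS-PRINCIPAL
                                LS-RATE
                                LS-TERM-MONTHS
                                LS-RESULT.
       STUB-MAIN.
      *>   Controlled test data: always report 5500.00 interest.
           MOVE 5500.00 TO LS-RESULT
           MOVE 0 TO RETURN-CODE
           GOBACK.
```

Because the stub ignores its inputs, any assertion in the calling program's test that depends on the interest amount can be written against a known value, independent of the real calculation.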

The Four ABENDs Every Test Suite Must Catch

An ABEND (abnormal end) is a mainframe program crash. Your test suite should include scenarios that deliberately trigger the four most common ABENDs. If your tests catch these before production, you have already prevented the majority of overnight batch failures.

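S0CB: Divide by zero. Strictly, S0CB is a decimal-divide exception: it is raised when a DIVIDE or COMPUTE on packed-decimal data has a divisor of zero. The equivalent for binary arithmetic is S0C9. Test scenario: pass zero into any divisor field and verify the program handles it with an ON SIZE ERROR clause instead of crashing.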

S0C7: Data exception. Triggered when the program performs arithmetic on a field that contains non-numeric data (spaces, low-values, or garbage). Test scenario: initialize a numeric PIC 9 field with spaces and call the paragraph that processes it. If the program abends, it needs input validation.

S322: Timeout (infinite loop). Triggered when a job exceeds its TIME parameter on the JCL (Job Control Language, the job-scheduling syntax on z/OS). Test scenario: set a short TIME limit in your test JCL and feed data that causes a PERFORM UNTIL loop to never satisfy its exit condition.

S0C4: Storage violation. Triggered when the program reads or writes memory it does not own. Test scenario: populate a table past its OCCURS boundary and verify the program checks the index before the MOVE. Under CICS, this is the ABEND that corrupts Storage Check Zones and causes cascading failures.
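Three of these scenarios can be sketched as a single hand-rolled test program. The S322 case is left out because it depends on the JCL TIME parameter, not on COBOL source. Everything here is illustrative: the program name ABNDTEST, the field names, and the three stand-in paragraphs (COMPUTE-AVERAGE, ADD-AMOUNT, STORE-RATE) are assumptions standing in for the paragraphs you would actually test. Each test feeds the bad input and passes only if the guard sets an error switch instead of letting the statement run.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ABNDTEST.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Fields for the zero-divisor case (S0CB).
       01  WS-TOTAL-AMT          PIC 9(7)V99 COMP-3 VALUE 1200.00.
       01  WS-ITEM-COUNT         PIC 9(5)    COMP-3 VALUE ZEROS.
       01  WS-AVERAGE            PIC 9(7)V99 COMP-3 VALUE ZEROS.
      *> Fields for the non-numeric case (S0C7).
       01  WS-IN-AMOUNT-X        PIC X(7).
       01  WS-IN-AMOUNT REDEFINES WS-IN-AMOUNT-X
                                 PIC 9(5)V99.
      *> Fields for the table-overflow case (S0C4).
       01  WS-RATE-TABLE.
           05  WS-RATE-ENTRY     PIC 9(3)V99 OCCURS 10 TIMES.
       01  WS-MAX-ENTRIES        PIC 9(3) VALUE 10.
       01  WS-IDX                PIC 9(3) VALUE ZEROS.
      *> Shared test bookkeeping.
       01  WS-ERROR-SW           PIC X VALUE 'N'.
           88  WS-ERROR-FOUND    VALUE 'Y'.
       01  WS-TEST-NAME          PIC X(20).
       01  WS-FAILURES           PIC 9(3) VALUE ZEROS.
       PROCEDURE DIVISION.
       RUN-TESTS.
           MOVE 'ZERO DIVISOR' TO WS-TEST-NAME
           MOVE 'N' TO WS-ERROR-SW
           MOVE ZEROS TO WS-ITEM-COUNT
           PERFORM COMPUTE-AVERAGE
           PERFORM CHECK-GUARD

           MOVE 'NON-NUMERIC INPUT' TO WS-TEST-NAME
           MOVE 'N' TO WS-ERROR-SW
           MOVE SPACES TO WS-IN-AMOUNT-X
           PERFORM ADD-AMOUNT
           PERFORM CHECK-GUARD

           MOVE 'TABLE OVERFLOW' TO WS-TEST-NAME
           MOVE 'N' TO WS-ERROR-SW
           MOVE 11 TO WS-IDX
           PERFORM STORE-RATE
           PERFORM CHECK-GUARD

           IF WS-FAILURES > 0
               MOVE 8 TO RETURN-CODE
           ELSE
               MOVE 0 TO RETURN-CODE
           END-IF
           STOP RUN.
      *> A test passes when the guard set the error switch.
       CHECK-GUARD.
           IF WS-ERROR-FOUND
               DISPLAY 'PASS: ' WS-TEST-NAME
           ELSE
               DISPLAY 'FAIL: ' WS-TEST-NAME
               ADD 1 TO WS-FAILURES
           END-IF.
      *> Stand-ins for the paragraphs under test.
       COMPUTE-AVERAGE.
           DIVIDE WS-TOTAL-AMT BY WS-ITEM-COUNT
               GIVING WS-AVERAGE
               ON SIZE ERROR
                   MOVE 'Y' TO WS-ERROR-SW
           END-DIVIDE.
       ADD-AMOUNT.
           IF WS-IN-AMOUNT IS NUMERIC
               ADD WS-IN-AMOUNT TO WS-TOTAL-AMT
           ELSE
               MOVE 'Y' TO WS-ERROR-SW
           END-IF.
       STORE-RATE.
           IF WS-IDX >= 1 AND WS-IDX <= WS-MAX-ENTRIES
               MOVE 5.50 TO WS-RATE-ENTRY (WS-IDX)
           ELSE
               MOVE 'Y' TO WS-ERROR-SW
           END-IF.
```

With the guards in place, all three cases should report PASS. Remove a guard (for example, delete the ON SIZE ERROR phrase) and the corresponding test either fails or reproduces the ABEND, which is the point of the exercise.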

The Open Mainframe Project's LFX Mentorship labs from 2025 provide worked examples for S0CB, S0C7, and S322, complete with JCL, COBOL source, and step-by-step debugging instructions. These are the best free resources for learning to reproduce and diagnose these ABENDs. Note: the 2025 mentorship contribution set did not include an S0C4 lab.

Isolating WORKING-STORAGE Before Testing

The core challenge of COBOL unit testing is state. Every test case inherits the WORKING-STORAGE state left behind by the previous test case (or by the program's VALUE clauses at load time). If test A sets WS-CUSTOMER-TYPE to 'P' and test B assumes it is spaces, test B will pass or fail depending on execution order. That is not a test suite. That is a random number generator.

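Use INITIALIZE aggressively. The INITIALIZE statement resets a group item or elementary item to its default: spaces for alphanumeric fields, zeros for numeric fields. A plain INITIALIZE does not reapply VALUE clauses and skips FILLER items, so any field that must start at a non-default value needs an explicit MOVE. Before each test entry point, INITIALIZE every group item in WORKING-STORAGE that the subject program will touch.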

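       BEFORE-EACH-TEST.
      *>   Reset every group the subject program reads or writes.
           INITIALIZE WS-INPUT-RECORD
           INITIALIZE WS-OUTPUT-RECORD
           INITIALIZE WS-WORK-FIELDS
           INITIALIZE WS-COUNTERS
      *>   WS-EOF is the 88-level on the end-of-file flag; SET TO
      *>   FALSE needs a false value declared on that 88-level.
           SET WS-EOF TO FALSE
           MOVE ZEROS TO WS-RETURN-CODE.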

Use SET for 88-level switches. INITIALIZE does not affect 88-level condition names, and this is by design. 88-level items occupy no storage of their own; they are labels on the parent field's value. INITIALIZE operates on storage, so it resets the parent field (to spaces or zeros), which may or may not satisfy the 88-level condition. You must SET each flag explicitly to guarantee its state. If your program has a flag like 88 WS-EOF VALUE 'Y', SET WS-EOF TO FALSE before each test. Note that SET ... TO FALSE compiles only when the 88-level also declares a false value (for example, 88 WS-EOF VALUE 'Y' WHEN SET TO FALSE IS 'N') and your compiler supports that clause; otherwise, MOVE the "off" value to the parent field directly.
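A small sketch makes the difference between INITIALIZE and SET visible. The program name SWTEST and the group WS-FLAGS are hypothetical, and the WHEN SET TO FALSE phrase is assumed to be available in your compiler (it is accepted by GnuCOBOL and Micro Focus dialects; check your compiler's language reference before relying on it elsewhere).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SWTEST.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-FLAGS.
           05  WS-EOF-FLAG       PIC X VALUE 'N'.
               88  WS-EOF        VALUE 'Y'
                                 WHEN SET TO FALSE IS 'N'.
       PROCEDURE DIVISION.
       TEST-ENTRY.
      *>   Simulate a previous test that hit end-of-file.
           SET WS-EOF TO TRUE
      *>   INITIALIZE moves a space to WS-EOF-FLAG. WS-EOF is no
      *>   longer true, but the flag holds ' ' and not 'N'.
           INITIALIZE WS-FLAGS
           DISPLAY 'AFTER INITIALIZE: [' WS-EOF-FLAG ']'
      *>   SET stores the declared false value in the parent field.
           SET WS-EOF TO FALSE
           DISPLAY 'AFTER SET FALSE:  [' WS-EOF-FLAG ']'
           STOP RUN.
```

The space left by INITIALIZE matters whenever the subject program tests the flag with something like IF WS-EOF-FLAG = 'N' instead of IF NOT WS-EOF. An explicit SET (or MOVE) removes that ambiguity.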

Stub programs for file I/O. You cannot INITIALIZE a file. Instead, write a stub program that replaces the file READ operation. The stub returns a hardcoded record (or a sequence of records from a WORKING-STORAGE table) and sets the file status to '00' for success or '10' for end-of-file.
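One way to hand-roll such a stub is sketched below. It assumes the subject program's READ has been routed through a CALL to an I/O module, here given the hypothetical name READTRAN, that returns one record and a two-byte file status per call. ZUnit generates its own file stubs and does not require this arrangement. The 19-byte record layout (a 10-byte account and a 9-digit amount) is also an assumption.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. READTRAN.
      *> File-read stub: hands back canned records one per call,
      *> then reports end-of-file.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-NEXT-REC           PIC 9(3) VALUE 1.
       01  WS-REC-COUNT          PIC 9(3) VALUE 3.
       01  WS-CANNED-DATA.
           05  FILLER  PIC X(19) VALUE 'ACCT000001000010000'.
           05  FILLER  PIC X(19) VALUE 'ACCT000002000025050'.
           05  FILLER  PIC X(19) VALUE 'ACCT000003000000000'.
       01  WS-CANNED-TABLE REDEFINES WS-CANNED-DATA.
           05  WS-CANNED-REC     PIC X(19) OCCURS 3 TIMES.
       LINKAGE SECTION.
       01  LS-RECORD             PIC X(19).
       01  LS-FILE-STATUS        PIC XX.
       PROCEDURE DIVISION USING LS-RECORD LS-FILE-STATUS.
       STUB-MAIN.
           IF WS-NEXT-REC > WS-REC-COUNT
      *>       Table exhausted: report end-of-file.
               MOVE '10' TO LS-FILE-STATUS
           ELSE
               MOVE WS-CANNED-REC (WS-NEXT-REC) TO LS-RECORD
               MOVE '00' TO LS-FILE-STATUS
               ADD 1 TO WS-NEXT-REC
           END-IF
           GOBACK.
```

The stub's own WORKING-STORAGE persists between calls, which is what lets WS-NEXT-REC advance through the table. The same persistence means a second test will start where the first one stopped unless the cursor is reset, which this sketch does not do.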

What a Minimal Viable Test Suite Looks Like

A critical system with less than 10% unit test coverage presents a high risk of regression with every deployment. That is not an opinion. It is a measurable risk factor that shows up in incident post-mortems, audit findings, and modernization feasibility studies.

Ten percent coverage is where you start. It is not where you stop. But getting from zero to ten percent is the hardest part, because you have to build the infrastructure: the test runner configuration, the stub library, the CI pipeline integration, and the team's muscle memory for writing tests.

Which programs to test first. Sort your COBOL inventory by two axes: churn (how often the source changes) and business criticality (what breaks if this program fails). The programs in the top-right quadrant of that matrix get tests first. A program that changes monthly and processes payments is more urgent than a program that has not changed in five years and generates a weekly report.

The coverage threshold that unblocks modernization. Most modernization vendors and internal architecture boards will not approve a refactoring effort against a program with zero tests. The threshold varies, but in practice, having tests that cover the primary happy path plus each major error branch (the four ABENDs above, plus business-rule edge cases) is enough to unblock the first wave of changes. You do not need 80% line coverage. You need enough to catch the regressions that would stop a deployment.

Testing Programs with CICS Calls

Programs that run under CICS (Customer Information Control System, IBM's online transaction processor) add another layer of complexity. They contain EXEC CICS statements that read and write to temporary storage queues, issue LINK and XCTL calls to other programs, send and receive BMS maps, and manage conversational state through COMMAREA (communication area).

You cannot unit test EXEC CICS statements directly. They require a running CICS region, transaction definitions, and resource table entries. But you can remove them from the equation and test the COBOL logic that surrounds them.

The mfupp preprocessor (Micro Focus Unit Test Seam pre-processor). This tool, part of the Rocket Software / Micro Focus testing framework, processes your COBOL source before compilation and handles EXEC CICS statements in two ways: it can ignore them entirely (removing the CICS calls so you can test the surrounding logic) or it can insert mock callback statements that simulate CICS responses.

Test programs using mfupp are typically prefixed MFUPD_ by convention and can access the subject program's fields directly. The test runner uses internal metadata fields to register and dispatch test entry points. Note: specific internal identifiers such as MFUPP--INIT-CICS, MFUPP--END-CICS, and MFU-MD-EXEC-CONTROLLER are implementation details not documented in public-facing Rocket Software documentation and should not be treated as stable API surfaces.

What mfupp cannot do. It does not test the CICS application as a whole. It does not validate that your BMS map definitions are correct, that your COMMAREA layout matches the calling program, or that your temporary storage queue names resolve in the production CICS region. Those are integration tests. They require a running CICS environment and are outside the scope of unit testing.

For VS Code users, configure the MFUnit test discoverer in settings.json to enable Test Explorer integration:

{
  "microFocusCOBOL.mfunit.discoverers": [
    {
      "globPattern": "bin/*.{dll,so}",
      "is64Bit": false,
      "runner": "mfurun"
    },
    {
      "globPattern": "bin.net/*.dll",
      "is64Bit": false,
      "runner": "mfurunil"
    }
  ]
}

On UNIX platforms, replace mfurun with cobmfurun (non-threaded) or cobmfurun_t (threaded). After configuring the discoverer, build your COBOL test sources. Any test programs matching the glob pattern will appear in the VS Code Test Explorer sidebar, where you can run and debug them individually.