Batch processing is the procedure by which computers automatically complete batches of jobs in sequential order with minimal human interaction. It is named after the practice of queuing batches of punched cards to load data into the mainframe’s memory for processing. The terminology has changed over the years, and the concept now goes by several names, including job scheduling and workload automation. These newer approaches have made batch processing more sophisticated and efficient, and they now extend to technologies such as cloud computing, big data, AI and IoT.
The Evolution of Batch Processing
Processing batch jobs has been a major IT concept since the early years of computing technology. It established the basis for the later concepts of job scheduling and workload automation, and it is tightly bound to the evolution of the computing industry as a whole.
In the early days, there were mainframe computers and something called a batch window. Since all computing activities were carried out on mainframes, finding the best way to utilize this limited resource was critical. Batch windows were nightly periods during which large numbers of batched jobs were run offline, a practice that was developed to free up computing power during the day. This allowed end-users to run transactions without the system drag caused by high-volume batch processing. In general, mainframe workloads were classified into batch workloads and online transaction processing (OLTP) workloads.
- Batch jobs were used for processing large volumes of data for routine business processes such as monthly billing and payroll.
- Online (interactive) transactions (OLTP) handled users’ action-driven, interactive processing, meaning users no longer needed to wait for processing, but instead received an immediate response when requests were submitted.
In the past, IT operators submitted batch jobs based on an instruction book that told them not only what to do, but also how to handle certain conditions, such as when things went wrong. Jobs were typically scripted in job control language (JCL), a standard mainframe language for defining how batch programs should be executed. For a mainframe computer, JCL:
- Identified the batch job submitter
- Defined what program to run
- Stated the locations of input and output
- Specified when a job was to be run
When one batch finished running, the operator would submit another. Early batch processes were meant to bring some level of automation to these tasks. While the approach was effective at first, it quickly became problematic as the number of machines, jobs, and scheduling dependencies increased.
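To make the idea concrete, here is a minimal Python sketch of a job definition and a strictly sequential run queue. It is an illustration only, not JCL and not any real scheduler's API: the `BatchJob` record, its field names, the `run_queue` helper, and the sample job and file names are all hypothetical, chosen to mirror the four things JCL specified.

```python
from collections import deque
from dataclasses import dataclass
from datetime import time


@dataclass
class BatchJob:
    """Hypothetical job record mirroring what a JCL job described."""
    submitter: str    # who submitted the batch job
    program: str      # what program to run
    input_path: str   # where the input is located
    output_path: str  # where the output is written
    run_at: time      # when the job is to be run


def run_queue(jobs, execute):
    """Run jobs one at a time, in submission order.

    `execute` is a caller-supplied function that runs a single job.
    Returns the names of the programs that were run, in order.
    """
    queue = deque(jobs)
    completed = []
    while queue:
        job = queue.popleft()
        execute(job)  # the next job starts only after this call returns
        completed.append(job.program)
    return completed


# Hypothetical nightly batch: payroll first, then billing.
jobs = [
    BatchJob("payroll-team", "PAYROLL", "timesheets.dat", "paychecks.dat", time(1, 0)),
    BatchJob("billing-team", "BILLING", "usage.dat", "invoices.dat", time(2, 0)),
]
print(run_queue(jobs, lambda job: None))  # ['PAYROLL', 'BILLING']
```

Note that `run_at` is only recorded here, not enforced. The sketch also has no notion of dependencies between jobs or of several machines, which is where the manual approach started to strain.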
Limitations of Batch Processing
Because batch processing was developed long ago, it comes with limitations, especially when compared to modern enterprise job scheduling solutions. Among other things, these limitations include:
- Considerable manual intervention required to keep batches running consistently
- Inability to centralize workloads between different platforms or applications
- Lack of an audit trail to verify the completion of jobs
- Rigid time-based scheduling rules that don't account for modern "real-time" data approaches
- No automated restart/recovery of scheduled tasks that fail
The practice of creating automated jobs using a batch process is still used today. However, the technology and sophistication of running automated jobs have substantially improved over time. Modern approaches remove the rigid time-based scheduling component of batch processing. Today, enterprises can schedule jobs, tasks and complete workflows using event-based triggers. This allows an organization to run its business automation in real time.
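As a rough illustration of an event-based trigger, the Python sketch below runs jobs when a named event occurs rather than at a fixed time. It is a simplified assumption of how such a mechanism might look, not the design of any particular product: the `EventScheduler` class, its `on` and `emit` methods, and the `file_arrived` event name are all hypothetical.

```python
class EventScheduler:
    """Hypothetical minimal scheduler: jobs run when a named event occurs."""

    def __init__(self):
        self._jobs = {}  # event name -> list of callables

    def on(self, event_name, job):
        """Register a job (any callable) to run when the event is emitted."""
        self._jobs.setdefault(event_name, []).append(job)

    def emit(self, event_name, payload=None):
        """Run every job registered for the event, in registration order.

        Returns the list of job results; unknown events run nothing.
        """
        return [job(payload) for job in self._jobs.get(event_name, [])]


scheduler = EventScheduler()
# Instead of "run at 01:00", the job runs as soon as its input shows up.
scheduler.on("file_arrived", lambda path: f"processed {path}")
print(scheduler.emit("file_arrived", "orders.csv"))  # ['processed orders.csv']
```

In a time-based batch, the same job would wait for its scheduled slot even if the file had arrived hours earlier. Here the arrival itself starts the work.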