AI-Driven Machine Vision Systems with NVIDIA GPU Performance

CXP Family

BitFlow Claxon frame grabbers accelerate AI-driven machine vision systems with NVIDIA GPU performance. Full CoaXPress 2.0 throughput and direct GPU integration put real-time AI inference within reach of vision engineers.

BitFlow, a world-leading manufacturer of industrial frame grabbers and a division of Advantech, has announced the full production availability of its Claxon CXP-12 frame grabbers, purpose-built for high-speed machine vision systems that integrate NVIDIA GPU-accelerated AI inference. The Claxon lineup spans five models, from the single-link Claxon CXP1 and quad-link Claxon CXP4 to the long-reach Claxon Fiber, giving system designers flexible solutions for every deployment scenario.

Artificial intelligence vision pipelines running on NVIDIA GPUs can deliver hundreds to thousands of TOPS in low-precision inference, enabling hundreds of images per second on real-time models like YOLO, with somewhat lower throughput on models like RetinaNet depending on the specific GPU and optimizations. Yet even the fastest GPU will sit idle if image data cannot reach it quickly enough. Standard GigE Vision is limited to ~1 Gbps, while 10 GigE Vision tops out at 10 Gbps. USB3 Vision offers only a fraction of that. CoaXPress 2.0 (CXP-12), by contrast, delivers 12.5 Gbps per link over standard coaxial cable, with link aggregation scaling to 50 Gbps across four links — five times the throughput of 10 GigE Vision and without the variable latency and overhead of Ethernet network stacks.

BitFlow’s Claxon frame grabbers are built on the CXP-12 standard from the ground up. Each board implements the full CoaXPress 2.0 specification, not a subset, which means system designers gain access to every capability the standard offers: simultaneous multi-camera capture, Power Over CoaXPress (13W per link), a 41.6 Mbps low-speed uplink for camera control, and full GenICam support for standardized camera configuration. The result is a deterministic, high-bandwidth data path that feeds GPU memory at the rate modern AI models demand.

“A machine vision system designer can only experience the full potential of CXP 2.0 by having access to all its capabilities,” said Donal Waide, Director of Business Development – iService, BitFlow. “With the Claxon family paired to an NVIDIA GPU platform, the image data path finally matches the inference engine’s appetite. There is no artificial ceiling on throughput.”

One Architecture. Every Application.

The BitFlow Claxon family’s half-size, low-profile x8 PCIe Gen 3 form factor slots directly into standard workstations, compact industrial PCs, and NVIDIA GPU platforms. In addition, Advantech offers several AI inference systems and industrial computers designed to be compatible with BitFlow frame grabbers, particularly for high-speed machine vision applications. Several of these systems feature NVIDIA GPUs or support NVIDIA GPU expansion for processing.

Engineers selecting a Claxon board choose the channel count that matches their camera configuration, not a board constrained by arbitrary feature cuts:

Claxon CXP1 — Single-link CXP-12 at 12.5 Gbps. The right choice for high-resolution single-camera inspection cells where PCIe slot count is at a premium.

Claxon CXP2 — Dual-link CXP-12 supporting two single-link cameras simultaneously at 12.5 Gbps each, or one dual-link camera at a combined 25 Gbps. Ideal for stereo vision, dual-angle inspection, and 3D reconstruction workloads.

Claxon CXP4 — Quad-link CXP-12 supporting four single-link cameras (12.5 Gbps each), two dual-link cameras (25 Gbps each), or one quad-link camera at 50 Gbps aggregate. With four cables and four CXP-12 cameras active, the maximum data transfer rate reaches 5 GB/s, the highest single-board acquisition rate in the industry. This board is available in two options to accommodate current demand. The two options are identical in format; one is built on an Altera FPGA design and the other on a Xilinx FPGA design.

Claxon CXP4-V — Architecturally identical to the CXP4 but equipped with active ventilation to manage FPGA thermal output in small-form-factor, fanless industrial computers where natural airflow is insufficient. A field-proven design for embedded AI vision nodes deployed in harsh environments.

Claxon Fiber (CoF) — Extends CoaXPress over QSFP+ fiber cable assemblies, supporting one quad-link, two dual-link, or four single-link CoF cameras at distances exceeding one mile. Fiber is immune to electromagnetic interference, making the Claxon Fiber the preferred choice for vision systems operating near high-voltage equipment, in broadcast facilities, or within large-scale factory floors where coaxial cable runs are impractical.
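The per-link and aggregate figures quoted for the Claxon models can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes 8b/10b line coding (8 payload bits for every 10 bits on the wire) and ignores packet and protocol overhead, and the function names are hypothetical rather than part of any BitFlow API.

```python
# Nominal CXP-12 line rate per link, as quoted in the model list.
LINE_RATE_GBPS = 12.5

# Assumption: 8b/10b line coding, so only 8 of every 10 line bits carry payload.
CODING_EFFICIENCY = 8 / 10


def aggregate_line_rate_gbps(links: int) -> float:
    """Nominal aggregate line rate, in gigabits per second."""
    return links * LINE_RATE_GBPS


def payload_gbytes_per_s(links: int) -> float:
    """Approximate payload rate, in gigabytes per second, after line coding."""
    return aggregate_line_rate_gbps(links) * CODING_EFFICIENCY / 8


if __name__ == "__main__":
    for links in (1, 2, 4):
        print(
            f"{links} link(s): {aggregate_line_rate_gbps(links):.1f} Gbps line rate, "
            f"about {payload_gbytes_per_s(links):.2f} GB/s payload"
        )
```

Under these assumptions, four links give a 50 Gbps line rate and roughly 5 GB/s of image payload, which is consistent with the figures quoted for the CXP4.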

From Camera to CUDA: Closing the Loop on AI Inference Latency

Claxon frame grabbers are engineered to integrate directly with NVIDIA GPU platforms, including systems built on the NVIDIA Jetson AGX Orin. In a validated reference design with Advantech’s MIC-733-AO AI edge computer, the Claxon CXP4 and Claxon Fiber boards connect to an optional PCIe x8 iModule that slots directly into the NVIDIA Jetson carrier, placing frame-accurate image data on the GPU memory bus with minimal CPU intervention. NVIDIA TensorRT inference pipelines then operate on live image buffers without the memory-copy overhead that plagues USB or GigE-based designs.

For data center-class GPU workstations running NVIDIA RTX or Data Center GPU families, the Claxon’s PCIe Gen 3 x8 bus interface feeds image data directly into GPU-accessible system memory via BitFlow’s SDK, which supports buffer management APIs in C, C++, C#, and Python. Computer vision engineers can pipe raw image buffers into CUDA processing kernels, PyTorch data loaders, or TensorRT execution engines with minimal latency added at the acquisition stage.

Software That Doesn’t Make Integration a Second Job

The BitFlow SDK supports both Windows and Linux, covering the breadth of operating environments found in modern AI vision deployments. Drivers are available for leading third-party vision environments including HALCON, LabVIEW, VisionPro, and MATLAB. Full GenICam compliance means any GenICam-standard camera configuration tool works with Claxon-connected cameras out of the box. Critically, the Claxon’s architecture is a direct evolution of BitFlow’s prior-generation Cyton CXP platform. Users migrating from CXP 1.1 systems can swap in Claxon boards and retain existing software without rewriting acquisition code or reconfiguring triggering logic.

BitFlow CoaXPress Frame Grabber Aids in SuperKEKB Particle Accelerator Beam Failure Troubleshooting

The SuperKEKB particle accelerator in Tsukuba, Japan, was constructed to achieve the highest particle collision rates in the world, enabling next-generation investigation of fundamental physics. SuperKEKB is unique in its employment of a nano-beam scheme that squeezes beams to nanometre-scale sizes at the interaction point, along with the use of a large crossing angle between the colliding beams to enhance electron–positron collision efficiency.

In its quest to reach the world’s highest collision rates, SuperKEKB has repeatedly suffered from Sudden Beam Loss (SBL) events. An SBL event occurs when vertical beam current is reduced by ten percent or more, leading to the process being aborted within a few turns lasting only 20 to 30 microseconds. It is unknown what specifically triggers an SBL event. According to one theory, beam orbit oscillation causes beam sizes to increase significantly a few turns before an SBL occurrence. Yet it was also observed that the size increase started earlier than the beam oscillation. Increases of up to ten times the usual beam size have been measured.

SBL is the biggest obstacle to the long-term stability of SuperKEKB beam operation. It also has the potential to seriously harm accelerator components within the electron or positron rings, which are situated side by side within a tunnel. Determining the source of SBL incidents and putting suppressive measures in place were therefore crucial.

IDENTIFYING THE ORIGIN OF SBL

To help uncover the root cause of SBL and ensure redundancy, the SuperKEKB team developed two turn-by-turn beam size monitors operating at different wavelengths: an X-ray system for beam size diagnostics, and a visible light monitor focusing on beam orbit variation and size increases.

The 99.4 kHz revolution frequency of the particle accelerator made it necessary to use imaging components compliant with the CoaXPress 2.0 (CXP-12) high-speed standard. In both the X-ray and visible light systems, data transfer rates of up to 50 gigabits per second were achieved by aggregating four links between a Mikrotron EoSens 1.1 CXP2 CMOS camera and a BitFlow Claxon CXP4 PCIe quad-link frame grabber. During data acquisition, the Mikrotron camera’s shutter was operated in precise synchronization with SuperKEKB’s 99.4 kHz revolution frequency. Captured image data was continuously stored in the BitFlow frame grabber’s 2 GB ring buffer. Only when a beam aborted was the data in the ring buffer moved to the disk server for offline analysis.
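The acquisition scheme described above, in which frames continuously overwrite a fixed-size buffer and are saved only when an abort occurs, is a classic pre-trigger ring buffer. The sketch below illustrates the idea in plain Python; the class and method names are hypothetical and do not represent the BitFlow SDK or the SuperKEKB software, and the capacity is counted in frames rather than bytes for simplicity.

```python
from collections import deque


class PreTriggerRing:
    """Fixed-capacity buffer that always holds the most recent frames."""

    def __init__(self, capacity_frames: int) -> None:
        # A deque with maxlen silently discards the oldest item when full.
        self._frames = deque(maxlen=capacity_frames)

    def push(self, frame) -> None:
        """Store a newly captured frame, overwriting the oldest if needed."""
        self._frames.append(frame)

    def dump(self) -> list:
        """Return the buffered frames, oldest first, and empty the buffer.

        In the scenario above this would be called when a beam abort is
        signalled, and the result would be written to disk for analysis.
        """
        snapshot = list(self._frames)
        self._frames.clear()
        return snapshot


if __name__ == "__main__":
    ring = PreTriggerRing(capacity_frames=4)
    for turn in range(10):      # ten "frames", one per simulated turn
        ring.push(turn)
    print(ring.dump())          # only the last four frames survive
```

The design choice is that nothing is written to storage during normal operation, so the sustained disk bandwidth requirement is zero and only the history leading up to the abort is kept.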

The Claxon CXP4 is also capable of handling 4 x 1-link cameras, 2 x 2-link cameras, or any combination of these. Each link supports data acquisition of up to 12.5 Gb/s. The highly deterministic, low-latency frame grabber also provides a low-speed uplink on all links, accurate camera synchronization, and 13W of Safe Power per link to all cameras.

By reducing the size of the camera’s Region-of-Interest (ROI), the X-ray monitoring system captured 99,400 frames per second, one frame per beam revolution. The visible light system used an ROI twice the size of the X-ray system’s and operated at 49,700 frames per second, so it measured the beam profile with one shot every two turns instead of every turn.
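The two frame rates follow directly from the 99.4 kHz revolution frequency and the number of turns between shots. The short sketch below works through that arithmetic; the function names are illustrative only.

```python
# Revolution frequency of the accelerator, as stated in the text.
REVOLUTION_HZ = 99_400


def frames_per_second(turns_per_frame: int) -> float:
    """Frame rate when one frame is captured every `turns_per_frame` turns."""
    return REVOLUTION_HZ / turns_per_frame


def frame_interval_us(turns_per_frame: int) -> float:
    """Time between consecutive frames, in microseconds."""
    return turns_per_frame * 1e6 / REVOLUTION_HZ


if __name__ == "__main__":
    for label, turns in (("X-ray", 1), ("visible light", 2)):
        print(
            f"{label}: {frames_per_second(turns):,.0f} frames/s, "
            f"one frame every {frame_interval_us(turns):.2f} microseconds"
        )
```

One shot per turn gives 99,400 frames per second, and one shot every two turns gives 49,700, matching the rates reported for the X-ray and visible light systems.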

DIFFERENTIATING BEAM PATTERNS

The frame grabber’s CXP-12 transmission speeds empowered SuperKEKB physicists to accurately differentiate between the various beam patterns developing before SBL events occurred.

Combining observations from both the X-ray and visible light monitoring systems, a possible SBL event scenario emerged. Physicists theorized that changes in the beam orbit may lead to a sudden increase in vacuum pressure in the damping section of SuperKEKB, with irradiation being the possible source. In this theory, when the beam hits a vacuum component, such as a beam collimator, the result is a sudden loss in beam current and an SBL event. However, this has not been fully clarified. To explore other possibilities, SuperKEKB is developing more advanced X-ray beam-size monitors that combine a silicon-strip sensor with a powerful ADC.

Visible light beam size monitor showing four cables connected to a Mikrotron CXP-12 camera running into a BitFlow Claxon CXP4 PCIe quad frame grabber to achieve 50 Gb/s data transfer rates (Image courtesy of SuperKEKB)


Claxon CXP4

Claxon CXP4 frame grabber

BitFlow CoaXPress-over-Fiber Frame Grabber Integrated with NVIDIA TensorRT in Real-time Human Pose Estimation

WOBURN, MA, JANUARY 8, 2025 – In collaboration with its parent company, Advantech, BitFlow announced today that it has successfully integrated its Claxon CoaXPress-over-Fiber (CoF) frame grabber with an Advantech AI inference edge computer and an Optronis Cyclone Fiber 5M camera to develop a real-time human pose estimation system accelerated by NVIDIA TensorRT deep learning.

One of the most advanced of its kind, the pose estimation system can provide low-latency analysis for athletic movement, gaming, physical therapy, AR/VR, fall detection, and online coaching. Traditional approaches to pose estimation required multiple cameras and special suits with markers, rendering them impractical for most applications. AI-driven computer vision has advanced this field to the point where a single camera can now capture professional-grade, real-time pose estimation.

With a processing time of less than 2 milliseconds, the system is capable of acquiring 2560 x 1916 resolution images at 600 frames per second. Once images are output to the BitFlow Claxon CoF frame grabber, the Claxon’s Direct Memory Access transmits them directly into the Advantech computer’s GPU memory, reducing bottlenecks and freeing up the CPU to apply an NVIDIA pre-trained algorithm that searches each frame for people. If the algorithm locates a person, it calculates a crude skeleton location and overlays the displayed image with a “stick figure” representing the person’s bone structure.

The Advantech MIC-733-AO AI edge computer is embedded with an NVIDIA Jetson AGX Orin that natively supports the NVIDIA TensorRT ecosystem of APIs for deep learning inference. An optional PCIe x8 iModule is available for the MIC-733-AO to accommodate BitFlow CoaXPress and Camera Link frame grabbers. 

The high throughput demands of the system required the use of the BitFlow Claxon CoF model. Designed to extend the benefits of CoaXPress over fiber optic cables, the Claxon CoF is a quad CXP-12 PCIe Gen 3 frame grabber that supports all QSFP+ compatible fiber cable assemblies. In addition to supporting high speeds, fiber cables are immune to EMI and are capable of running lengths well over a kilometer, far beyond Ethernet’s 100-meter limit.
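A rough bandwidth estimate shows why a quad-link board suits this camera. The sketch below assumes 8-bit pixels and 8b/10b line coding, neither of which is stated in the release, and ignores protocol overhead, so the result is an approximation rather than a measured figure.

```python
import math

# Image geometry and frame rate, as stated in the text.
WIDTH, HEIGHT, FPS = 2560, 1916, 600

# Assumption: 8-bit pixels, i.e. one byte per pixel.
BYTES_PER_PIXEL = 1

# CXP-12 line rate per link, and an assumed 8b/10b coding efficiency.
LINK_LINE_RATE_GBPS = 12.5
CODING_EFFICIENCY = 0.8


def camera_rate_gbytes_per_s() -> float:
    """Raw image data rate produced by the camera, in gigabytes per second."""
    return WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS / 1e9


def link_payload_gbytes_per_s() -> float:
    """Approximate payload capacity of one CXP-12 link, in gigabytes per second."""
    return LINK_LINE_RATE_GBPS * CODING_EFFICIENCY / 8


def links_needed() -> int:
    """Smallest number of CXP-12 links able to carry the camera's data rate."""
    return math.ceil(camera_rate_gbytes_per_s() / link_payload_gbytes_per_s())


if __name__ == "__main__":
    print(f"camera output: about {camera_rate_gbytes_per_s():.2f} GB/s")
    print(f"per-link payload: about {link_payload_gbytes_per_s():.2f} GB/s")
    print(f"links needed: {links_needed()}")
```

Under these assumptions the camera produces just under 3 GB/s, which is more than two links can carry, so a quad-link frame grabber is the natural fit.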

Real-time human pose system incorporating BitFlow CoF frame grabber, Advantech AI edge computer, and Optronis fiber camera, accelerated by NVIDIA TensorRT deep learning

Google Healthcare Relies on BitFlow CoaXPress Frame Grabber for Augmented Reality Microscope

WOBURN, MA, AUGUST 6, 2020 – BitFlow frame grabber technology has been incorporated into a prototype Augmented Reality Microscope (ARM) platform that researchers at Google AI Healthcare (Mountain View, CA) believe will accelerate the adoption of deep learning tools for pathologists around the world in the critical task of visually examining both biological and physical samples at sub-millimeter scales.

The application driving the ARM platform runs on a standard off-the-shelf computer with a BitFlow Cyton CoaXPress (CXP) 4-channel frame grabber (CYT-PC2-CXP4) connected to an Adimec S25A80 25-megapixel CXP camera for live image capture, along with an NVIDIA Titan Xp GPU for running deep learning algorithms. Using Artificial Intelligence (AI), the platform enables real-time image analysis and presentation of the results of machine learning algorithms directly into the field of view.

Importantly, the ARM can be retrofitted into existing light microscopes found in hospitals and clinics around the world using low-cost, readily-available components, such as the BitFlow Cyton frame grabber, and without the need for whole slide digital versions of the tissue being analyzed. This innovation comes as welcome news: despite significant advances in AI research, integration of deep-learning tools into real-world diagnosis workflows remains challenging because of the costs of image digitization and difficulties in deploying AI solutions in microscopic analysis. Besides being economical, the ARM platform is application-agnostic and can be utilized in most microscopy applications.

According to Google researchers, opto-mechanical component selection was driven by final performance requirements, specifically for effective cell and gland level feature representation. The Adimec camera’s 5120×5120 pixel color sensor features high sensitivity and a global shutter capable of capturing images at up to 80 frames/sec, while the BitFlow Cyton CXP4 has a universal PCI-E interface to the computer that simplifies set-up. The eMagin SXGA096 1292×1036 pixel microdisplay mounted on the side of the microscope includes an HDMI interface for receiving images from the computer. This opto-mechanical design can be easily retrofitted into most standard bright field microscopes. Including the computer, the overall cost of the ARM system is at least an order of magnitude lower than conventional whole-slide scanners, without incurring the workflow changes and delays associated with digitization.

The basic ARM pipeline consists of a set of threads that continuously grab an image frame from the camera, debayer it to convert the raw sensor output into an RGB color image, prepare the data, run the deep learning algorithm, process the results, and finally display the output.
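The staged, multi-threaded structure described above can be sketched with one thread per stage and a queue between neighbouring stages. The code below is a minimal illustration, not the Google implementation: the stage functions are placeholders that simply tag each frame with the stage name, and all names are hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel passed down the pipeline to shut it down


def make_stage(name):
    """Placeholder stage: records its name instead of doing real work."""
    def stage(item):
        return item + [name]
    return stage


def grab(frames, outbox):
    """Source thread: stands in for the frame grabber delivering images."""
    for frame in frames:
        outbox.put(frame)
    outbox.put(STOP)


def run_stage(func, inbox, outbox):
    """Worker loop: take an item, process it, hand it to the next stage."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)
            return
        outbox.put(func(item))


def run_pipeline(frames, stage_names):
    stages = [make_stage(name) for name in stage_names]
    # Bounded queues apply back-pressure if a downstream stage is slow.
    queues = [queue.Queue(maxsize=4) for _ in range(len(stages) + 1)]
    threads = [threading.Thread(target=grab, args=(frames, queues[0]))]
    for i, func in enumerate(stages):
        threads.append(
            threading.Thread(target=run_stage, args=(func, queues[i], queues[i + 1]))
        )
    for thread in threads:
        thread.start()
    # The calling thread plays the role of the display stage.
    displayed = []
    while (item := queues[-1].get()) is not STOP:
        displayed.append(item)
    for thread in threads:
        thread.join()
    return displayed


if __name__ == "__main__":
    names = ["debayer", "prepare", "infer", "postprocess"]
    for result in run_pipeline([["frame0"], ["frame1"]], names):
        print(result)
```

Because each stage runs in its own thread, a new frame can be debayered while the previous one is still in inference, which is the main benefit of splitting the work this way.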

Google researchers believe that the ARM has potential for a large impact on global health, particularly for the diagnosis of infectious diseases, including tuberculosis and malaria, in developing countries. Furthermore, even in hospitals that will adopt a digital pathology workflow in the near future, ARM could be used in combination with the digital workflow where scanners still face major challenges or where rapid turnaround is required as is the case with cytology, fluorescent imaging, or intra-operative frozen sections.

Since light microscopes have proven useful in many industries other than pathology, the ARM can be adapted for a broad range of applications across healthcare, life sciences research, and material science. Beyond the life sciences, the ARM can potentially be applied to other microscopy applications such as material characterization in metallurgy and defect detection in electronics manufacturing.

BitFlow Introduces SDK for NVIDIA Jetson AGX Xavier Development Kit

Jetson with a Claxon

BitFlow has released a Linux AArch64 (64-bit ARM) SDK that enables seamless integration of BitFlow frame grabbers with the NVIDIA Jetson AGX Xavier Development Kit. 

Donal Waide, Director of Sales for BitFlow, states, “Many of our customers are already using GPU solutions such as NVIDIA for image processing so adding this option to the already large BitFlow suite of adapters was a natural progression for the company. BitFlow has been supporting Linux for several years across a variety of flavors.”

Added Waide, “BitFlow was one of the first frame grabber companies to support NVIDIA’s GPUDirect for Video technology. BitFlow and NVIDIA have worked together for a number of years already.” 

With the advent of the new machine vision standard CXP 2.0, where data rates are now up to 50 Gb/s, customers are looking to process more and more data in shorter timeframes. A GPU can typically perform these tasks much more effectively than a CPU. Even with slower data rates such as Camera Link’s (up to 850 MB/s), the ability to quickly process more complex algorithms is equally important.

The NVIDIA Jetson AGX Xavier is the first computer designed specifically for autonomous machines. It has six engines onboard for accelerated sensor data processing and for running autonomous machine software, and offers the performance and power efficiency needed for fully autonomous machines.