IEEE Transactions on VLSI 2026 Research Papers

We offer IEEE projects for 2025-2026 in the latest technologies, including Java, .NET, Android, embedded, MATLAB, VLSI, Hadoop, power electronics, power systems, mechanical, and civil projects. IEEE Master is a unit of LeMeniz Technologies. We guide all final-year M.E/M.Tech, B.E/B.Tech, MPhil, MCA, BCA, M.Sc, B.Sc, and Diploma students in their academic projects to get the best results.

IEEE Project Technology

LOW POWER

S.No Code Title Year
1 VLSI_2026_03
DBP-CIM: Energy-Efficient 8T SRAM-Based Diagonal-Block Parallel Computing-in-Memory With Compact Data Layout for Arithmetic Operations
An energy-efficient SRAM-based computing-in-memory architecture for general-purpose arithmetic operations is presented. The design introduces a diagonal-block parallel mapping scheme to improve hardware utilization and reduce vacant memory cells caused by irregular bit widths during computation. A hardware-efficient arithmetic flow with pipelined addition and shift-based multiplication and division reduces critical path delay and resource usage. Implemented using an 8T SRAM-CIM macro in 28-nm CMOS technology, the system achieves improved throughput and energy efficiency, reducing energy consumption and computing cycles by up to 55.9% and 60.9%, respectively, compared to conventional bit-parallel CIM architectures.
2026
2 VLSI_2026_04
A 9T SRAM Computation-in-Memory Architecture With High-Precision MAC, Enhanced Bitline Voltage Margin and Improved Frequency Performance Over Conventional Architectures
To address the data-intensive demands of modern artificial intelligence (AI) systems, computation-in-memory (CIM) based on static random-access memory (SRAM) has emerged as a promising solution by integrating computing functionality within memory arrays. However, conventional SRAM CIM architectures face two key limitations: low output resistance in single-transistor transmission paths and voltage instability on charge-sharing bitlines. These limitations collectively degrade computational accuracy to 4–5 LSB-level integral nonlinearity (INL), restricting practical deployment. This work proposes a regulated-cascode 9T SRAM cell that enhances analog computation accuracy by using a high-impedance transmission path through a cascode configuration and by stabilizing the discharge amount of the bitline from a single cell via active feedback regulation. Implemented in Semiconductor Manufacturing International Corporation (SMIC) 55-nm CMOS technology, the proposed cell demonstrates 1.31 LSB INL at 400-mV bitline swing (68.4% improvement versus 4–5 LSB baselines), achieves 66.7% voltage utilization efficiency compared with the conventional 50% limit, and delivers a 23.04% frequency improvement over the conventional architecture. It also achieves an energy efficiency of 18.47 fJ/bit and a compact area of 2.655 × 1.175 µm, while demonstrating a classification accuracy of 97.7% on the MNIST dataset.
2026
3 VLSI_2026_05
A 25-Mb/s 4-ASK Receiver Front-End in 65-nm CMOS for Biomedical Data Telemetry via a Capacitive Link
This brief presents a 25-Mbps 4-amplitude-shift-keying (4-ASK) receiver front-end (RFE) for biomedical data telemetry via a series-resonant capacitive link. The RFE incorporates low-power clock and data recovery (CDR) circuitry for synchronization, in which a novel highly linear transconductance (Gm) cell is employed in the phase detector (PD) to mitigate decision errors when comparing the phase difference between the input and feedback signals. The proposed RFE is fabricated in 65-nm 1P8M standard CMOS; the core circuit occupies 0.11 mm² and consumes 2.9 mA from 1 V. In ex vivo measurements using beef tissue and a series-resonant capacitive link, the proposed RFE processes 4-ASK data patterns up to 25 Mbps with a bit error rate (BER) less than 10⁻³ and total jitter of ∼42 ns.
2026
4 VLSI_2026_10
A Study of the Design Criteria for 4-Stage, Pseudo-Differential Ring Oscillators to Self-Start From Any Initial State
This paper is a study of the minimum set of sufficient conditions that allow a 4-stage pseudo-differential ring oscillator to self-start from any initial state and the corresponding design criteria that allow these conditions to be met. First, a mathematical model that describes all the possible points of equilibrium is derived by exploiting the symmetry and regularity of the chain of gain stages that form the ring oscillator. The analysis then proceeds to determine the set of minimum conditions that guarantee that the ring oscillator can self-start. The result is a collection of equations that provides the proper design criteria to avoid any start-up failure. These criteria are only a function of the supply, the device threshold voltage, and the ratio between the forward and cross-coupled inverters. No further process technology or system parameters need be known. A graphic representation of the equations allows us to determine at a glance the region of guaranteed oscillations. Simulations that cover process and temperature corners of several 4-stage pseudo-differential ring oscillators built with different device sizes in a 7-nm FinFET technology confirm the validity of the mathematical analysis and of the equations derived from it.
2026
5 VLSI_2026_12
An Area-Efficient Fractional Output Divider Based on Foreground DTC INL Calibration
This brief presents a fractional output divider (FOD) with a foreground digital-to-time converter (DTC) INL calibration scheme. This calibration scheme adjusts the delay control words of two main DTCs (mDTCs) to enable mutual comparison between them. By using a sign-least-mean-squares (sign-LMS) algorithm, the INL error codes are obtained and subsequently applied to a calibration DTC (cDTC) to compensate for the mDTC INL. The prototype occupies a compact core area of 0.01 mm² and operates at a 0.9-V supply with a power consumption of 3.6 mW at 500 MHz. Measurements demonstrate an integrated jitter of 512 fs (10 kHz to 20 MHz) and a spur level of −70 dBc at 123.46 MHz.
2026
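The sign-LMS step named in entry 5 (VLSI_2026_12) can be illustrated in isolation. The sketch below is not the calibration loop from the brief, which compares two mDTCs; it only shows the general sign-based update, where a 1-bit comparison nudges a per-code correction by a fixed step. The function names, the step size `mu`, and the example INL values are all assumptions made for illustration.

```python
def sign(x):
    """Return +1, -1, or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_lms_calibrate(true_inl, mu=0.01, passes=200):
    """Estimate a per-code error using only the sign of the residual.

    true_inl : hypothetical INL value for each control code (arbitrary units)
    mu       : fixed update step (assumed value)
    passes   : number of sweeps over all codes
    """
    estimate = [0.0] * len(true_inl)
    for _ in range(passes):
        for code, inl in enumerate(true_inl):
            # A 1-bit comparator only reports whether the residual is + or -.
            residual_sign = sign(inl - estimate[code])
            # Sign-based update: move the estimate one step toward the target.
            estimate[code] += mu * residual_sign
    return estimate


if __name__ == "__main__":
    inl = [0.0, 0.35, -0.2, 0.5]  # made-up example values
    est = sign_lms_calibrate(inl)
    for code, (t, e) in enumerate(zip(inl, est)):
        print(f"code {code}: true={t:+.3f} estimate={e:+.3f}")
```

Once converged, each estimate dithers within about one step `mu` of its target, which is the usual trade-off of a sign-based update: a smaller step gives a finer result but slower convergence.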
6 VLSI_2026_13
A 6-GS/s 8-bit Time-Domain ADC With Selection-First Pipelined Successive Approximation Register TDC in 28-nm CMOS Technology
This brief presents a single-channel 8-bit time-domain analog-to-digital converter (TD-ADC) that employs a selection-first successive approximation register (SAR) time-to-digital converter (TDC) to address key limitations of prior TDC designs used in TD-ADCs. By adopting the selection-first approach, each bit decision requires only one reference delay path per bit, improving metastability tolerance compared to conventional computation-first designs. Moreover, the proposed TDC eliminates input-dependent errors and reduces the vulnerability to time-comparator mismatches observed in gate-based TDCs. Fabricated in 28-nm CMOS technology, the prototype TD-ADC achieves a signal-to-noise-and-distortion ratio (SNDR) of 36.4 dB for a Nyquist-rate input at 6 GS/s while consuming 51 mW.
2026
7 VLSI_2026_14
Frequency Synchronization Techniques for DC–DC Converters With Time-Based Control
In this brief, a frequency synchronization method for time-based-controlled power converters is presented. The proposed synchronization technique makes it possible to precisely lock the internal oscillator frequency of the time-based controller (i.e., the power converter switching frequency) to any externally provided clock signal, without trading off dynamic performance. By leveraging the differential structure of the time-based controller, the proposed feedback loop operates orthogonally with respect to the main voltage regulation loop, minimizing any interaction with the latter. The presented frequency synchronization scheme has been implemented on a time-based buck converter in a 180-nm BCD process to fully verify the performance.
2026
8 VLSI_2026_15
A 109-dB SFDR Continuous-Time Delta-Sigma Modulator for Audio Using Bi-Directional Advancing Data Weighted Averaging
This brief presents a high-linearity third-order 3-bit continuous-time delta-sigma modulator (CTDSM) with 109 dB spurious-free dynamic range (SFDR) for audio applications. The proposed modulator employs a cascade of integrators with feedforward structure using active RC integrators, a 3-bit flash quantizer, and resistor digital-to-analog converter (DAC) feedback. To address both DAC non-linearity and the tonal issues caused by conventional data weighted averaging (DWA), a new bi-directional advancing DWA (Bi-ADWA) with tone-suppressing and tone-transferring capabilities is introduced. Furthermore, the asymmetry in the rise/fall time of the DAC due to the additional paths introduced by Bi-ADWA is mitigated by optimizing the switch control circuitry and layout of the feedback DAC. Fabricated in a 180-nm CMOS technology, the modulator demonstrates a peak SNDR of 95.8 dB alongside a DR of 102.5 dB, consuming 0.58 mW at 1.8 V within an active area of 0.8 mm².
2026
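Entry 8 (VLSI_2026_15) builds on conventional data weighted averaging, which is easy to show in a few lines. The sketch below implements only the conventional rotating-pointer DWA that the abstract refers to as the baseline; it does not attempt the proposed Bi-ADWA. The function name and the choice of seven unit elements for a 3-bit DAC are assumptions.

```python
def dwa_select(codes, num_elements=7):
    """Conventional DWA: pick unit elements in rotation.

    codes        : sequence of DAC input codes (0..num_elements)
    num_elements : number of unit DAC elements (assumed 7 here)
    Returns the list of selected element indices per sample and the
    total number of times each element was used.
    """
    pointer = 0
    usage = [0] * num_elements
    selections = []
    for code in codes:
        # Select `code` consecutive elements starting at the pointer.
        chosen = [(pointer + i) % num_elements for i in range(code)]
        for index in chosen:
            usage[index] += 1
        selections.append(chosen)
        # Advance the pointer so the next sample continues the rotation.
        pointer = (pointer + code) % num_elements
    return selections, usage


if __name__ == "__main__":
    picks, counts = dwa_select([3, 4, 2, 5])
    for code_picks in picks:
        print(code_picks)
    print("usage per element:", counts)
```

Because every element is used equally often over time, mismatch error is pushed toward high frequencies; the rotation pattern itself is what can produce the tones that the abstract says Bi-ADWA is designed to handle.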
9 VLSI_2026_16
A Compact and Efficient 40 V Capacitive Level Shifter With Feedback Discharge Control for Enhanced CMTI and Low FoM for Bootstrapped Gate Drivers in BCD Technology
A high-speed, low-figure-of-merit capacitive floating level shifter (CF-LS) for high-side bootstrapped gate drivers is presented. The design integrates a feedback discharge control (FBDC) for enhanced VSW dV/dt (SRSW) immunity and combines a mismatch reduction circuit (MMRC) with a common-mode blanking pulse (CMBP) to block common-mode (CM) signals with mismatch. Fabricated in a 0.75-µm STMicroelectronics bipolar-CMOS-DMOS (BCD) process, the proposed CF-LS occupies 0.048 mm² with the core area being 0.026 mm², making it highly compact and integration-friendly. It achieves a propagation delay (TP) of 1.525 ns and transition energy (ET) of 29.77 pJ at 40 V, both of which remain approximately constant across the VSW sweep from 0 to 40 V, while maintaining measured SRSW immunity of up to 20 V/ns and mismatch tolerance up to 30%. Compared to recent works, the design offers a 2× improvement in figure-of-merit 2 (FoM2), achieving 2.68 ns·pJ/µm3·V at a bootstrap supply (VBS) of 5 V, and a 3× improvement when supplied at 3.3 V. The CF-LS has been comprehensively validated both as a standalone circuit and when integrated within a dc–dc buck converter, making it a highly robust, efficient, and versatile solution for high-frequency power management systems operating up to 55 MHz.
2026
10 VLSI_2026_18
A Reconfigurable Built-In Self-Test Scheme for the Evaluation Circuits of Digital SRAM-IMC Architectures
Digital static random access memory-based in-memory computing (SRAM-IMC) is a promising computation paradigm to break the von Neumann bottleneck. However, IMC architectures also bring a series of challenges for testing, because of circuit structures and operations that do not exist in conventional memories. One of the challenges is the testing of evaluation circuits in digital SRAM-IMC architectures, because the primary inputs (PIs) of the evaluation circuits cannot be directly accessed by the testers. Several test approaches, such as conventional logic built-in self-test (LBIST) modules and the indirect and scan-chain-based test methods, have been proposed to address this issue. Nevertheless, these solutions suffer from low test performance or high area consumption. This work proposes a reconfigurable built-in self-test (BIST) scheme for the evaluation circuits. By reusing the IMC bitcells and operations, the proposed BIST scheme implements separate pattern generation (PG) and response analysis (RA) processes. Furthermore, diverse pattern generators, including the Fibonacci linear feedback shift register (LFSR) and weighted LFSR (WLFSR) with adjustable feedback polynomials and the cellular automata (CA), are realized to improve the test efficiency and fault coverage. The evaluation results show that the proposed BIST scheme has better test performance than the indirect and scan-chain-based test approaches. Compared with conventional LBIST schemes, it has comparable test performance and much less area overhead. Additionally, the proposed BIST scheme is testable and repairable.
2026
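The Fibonacci LFSR mentioned in entry 10 (VLSI_2026_18) is a standard pattern generator and can be sketched in software. The example below is a generic LFSR with a configurable tap list, loosely mirroring the "adjustable feedback polynomials" in the abstract; it is not the in-memory implementation from the paper. The function name, the 4-bit width, and the polynomial x^4 + x^3 + 1 are assumptions chosen because that polynomial gives the maximal period of 15.

```python
def lfsr_patterns(seed, taps, width, count):
    """Generate `count` successive states of a Fibonacci LFSR.

    seed  : non-zero initial state
    taps  : 1-based bit positions XORed to form the feedback bit,
            e.g. (4, 3) for the polynomial x^4 + x^3 + 1
    width : register width in bits
    """
    mask = (1 << width) - 1
    state = seed & mask
    states = []
    for _ in range(count):
        states.append(state)
        # Feedback bit is the XOR of all tapped bits.
        feedback = 0
        for tap in taps:
            feedback ^= (state >> (tap - 1)) & 1
        # Shift left by one and insert the feedback bit at the LSB.
        state = ((state << 1) | feedback) & mask
    return states


if __name__ == "__main__":
    for pattern in lfsr_patterns(seed=0b0001, taps=(4, 3), width=4, count=16):
        print(f"{pattern:04b}")
```

Changing `taps` changes the feedback polynomial and therefore the sequence of test patterns, which is the knob a reconfigurable generator would expose.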
11 VLSI_2026_20
A 40-nm Embedded Flash With Highly Reliable Bitline Transmission and Low-Voltage Current Sense Amplifier
A novel read circuit for embedded flash memory operating from a single 1.1-V supply is presented, featuring a negative-voltage dual-MOS transmission structure for bitline (BL) transfer and a low-voltage, high-reliability current–voltage sense amplifier (SA). This approach effectively addresses the issue of reduced read window size due to decreased BL voltage as supply voltage decreases. The proposed circuit is integrated into a 4.5-Mbit embedded flash memory test chip, fabricated using a 40-nm CMOS process. Experimental results show a read access time of 18.5 ns at a supply voltage of 0.9 V, with a read throughput of 7.78 Gbit/s and a bit density of 3.414 Mbit/mm².
2026
12 VLSI_2026_22
A Fully Integrated Storage-Free Energy Harvesting System With Voltage Self-Regulation and Dual-Channel Power Extraction
With the growing demand for self-powered operation in Internet of Things (IoT) nodes and microsensors, photovoltaic energy harvesting (EH) systems have become a key research focus. Although triple-well on-chip solar cells improve photovoltaic conversion efficiency, their multiport output characteristics pose design challenges: conventional parallel output schemes cannot address the mismatch in maximum power point (MPP) voltages among ports, thereby limiting energy extraction efficiency. Moreover, fully integrated EH systems lack external storage, and input–output coupling renders traditional voltage regulation unsuitable. To address these challenges, this article presents a fully integrated dual-channel EH system that employs a dual-port, independently boosted parallel architecture combined with a frequency-driven lightweight self-regulation mechanism, eliminating the need for conventional regulators. Measurement results show that the system achieves a stable output voltage of approximately 1.22 V, a peak end-to-end efficiency of 63.97%, and a 6.31% improvement in maximum output power, while also exhibiting reliable start-up and ripple suppression, providing valuable guidance for the design of efficient power extraction and self-regulation in on-chip photovoltaic EH systems.
2026
13 VLSI_2026_31
A Tri-Band Two-Stage LNA With Simultaneous Linearity and Gain Enhancement for WiFi
This brief presents a tri-band, two-stage compact low-noise amplifier (LNA) that simultaneously enhances linearity, gain, and noise performance for WiFi applications. The second stage adopts a dual-path architecture, consisting of a main amplifier and an additional amplifier. The additional amplifier, biased in the subthreshold region, suppresses third-order nonlinearity and enhances gain without increasing power consumption. The first-stage LNA reduces the noise contribution from the second stage, improving overall noise performance. To further minimize power consumption, an inverter-based topology is employed. Fabricated in a 90-nm CMOS process, the proposed LNA achieves an S11 below −5 dB at 2.4, 5, and 6 GHz, covering key WiFi bands. At 6-GHz band, it delivers 13.5-dB gain, 3.2-dBm third-order input intercept point (IIP3), and 3.1-dB noise figure (NF). At 5-GHz band, it achieves 15-dB gain, 0.7-dBm IIP3, and 2.76-dB NF. At 2.4-GHz band, it provides 20.66-dB gain, −7-dBm IIP3, and 2.7-dB NF. The circuit consumes only 3 mW of dc power. Measurements at 6 GHz show that the dual-path technique in the second stage improves IIP3 by 8.6 dB, increases gain by 1.5 dB, and reduces NF by 0.6 dB, all without additional power or area overhead.
2026
14 VLSI_2026_32
A 0.31-V 16-Kb 9T SRAM With Enhanced Sensing Margin and Read Performance for Low-Power Applications
This brief presents a low-power 9T static random access memory (SRAM) with enhanced read sensing margin and read performance. The read-decoupled port of the proposed 9T SRAM cell achieves the enhanced sensing margin by mitigating the read bitline (RBL) leakage and improves the read performance by using a one-transistor read path. Multithreshold voltage devices are used in the SRAM cell to improve the leakage power and performance of the SRAM. Additionally, an interleaved write wordline (WWL) structure is implemented to address the write half-select issue. The measurement results of the test chip fabricated in 22-nm FDSOI technology demonstrate that the designed 9T SRAM achieves a minimum operation voltage of 0.31 V at 1.05 MHz and can operate at 60.5 MHz when the supply voltage is 0.5 V. The minimum active energy of 18.56 fJ/access-bit is obtained at 0.33 V. Furthermore, the designed SRAM exhibits a minimum leakage power of 0.11 pW/bitcell in the retention mode.
2026
15 VLSI_2026_33
FADE: Fault-Aware Adaptive On-Die ECC for Improving Robustness
The increasing density of dynamic random access memory (DRAM) renders permanent faults and soft errors more prevalent, which critically reduces yield and reliability. Although error correction code (ECC) can mitigate this issue, existing ECCs are not optimized for fault correction. As a result, fault tolerance remains insufficient, and the error correction capability in the presence of faults is degraded. Therefore, to improve DRAM robustness by efficiently addressing both permanent faults and soft errors, this brief proposes a fault-aware adaptive on-die ECC (FADE) in which two ECC engines independently operate in either fault mode (FM) or error mode (EM) according to the number of faulty symbols (FSs). In FM, a fault polynomial is reconstructed by reusing the fault addresses that the built-in self-repair (BISR) stores in content-addressable memory (CAM). To calculate the corresponding fault magnitudes, a modified decoding equation is employed. As a result, the number of correctable FSs in FM doubles compared to the conventional ECC. Moreover, with the proposed symbol-based fault isolation, both fault tolerance and error correction capability in the presence of faults are drastically enhanced. Additionally, the experimental results show that the proposed design can be implemented with a reasonable overhead in terms of delay and area.
2026
16 VLSI_2026_34
A Pattern-Dependent Pulse Filtering Technique for Low-Jitter Injection-Locked CDR in 28-nm CMOS
This work presents a ring oscillator (RO)-based low-jitter injection-locked clock and data recovery (ILCDR) with a pattern-dependent pulse filtering (PDPF) technique. The conventional ILCDR has a drawback that data jitter is transferred to the recovered clock. To reduce jitter, the PDPF technique is employed to filter out the injection pulses occurring in data patterns that cause high data-dependent jitter (DDJ). Adopting the PDPF technique with an injection timing control loop, the ILCDR optimizes injection timing and maximizes timing margin. Fabricated in a 28-nm CMOS technology, the proposed ILCDR occupies an active area of 0.03 mm² and consumes 13.6 mW at 10 Gb/s. The measured jitter tolerance (JTOL) is 1 UIpp at 35 MHz with a bit error rate (BER) of 10⁻¹².
2026
17 VLSI_2026_35
A Low-Cost 0.28 mm² Dual-Mode ASK Demodulator NFC Forum Type 5 Compatible Fully Integrated IoT Tag
With the increasing prevalence of NFC-enabled smartphones, next-generation IoT devices demand contactless information interaction capabilities. This paper presents an NFC tag chip architecture for large-scale IoT applications, achieving reliable data transmission through RF coupling. Addressing the critical challenge of balancing power consumption and cost in industrial deployment, the design features a dual-mode ASK demodulator (supporting 100% and 10% modulation depth) compliant with NFC Forum Type 5 specifications, achieving ultra-low power consumption of 83.06 µW through dynamic clock division and gating techniques. For cost optimization, the architecture employs a byte-level anti-collision protocol for memory structure optimization and replaces conventional comparators with inverter chains. Implemented in SMIC 0.13-µm EEPROM process, the compact layout of 544.24 µm × 516.60 µm maintains unit cost below 1 cent. Experimental results demonstrate stable communication under a maximum 9.44-µs modulation gap through RC-delay pulse shaping and an LDO with <7% ripple. This research provides an economical and reliable hardware solution for billion-scale IoT node deployment.
2026
18 VLSI_2026_36
Detailed Study of Phase and Amplitude Imbalances in Quadrature LC CMOS Oscillators
Conventional analyses of phase and amplitude imbalances in quadrature LC CMOS oscillators are typically performed in the time-domain under large-signal conditions, where tank series losses are approximated as a fixed parallel resistance to simplify the analysis. However, this simplification fails to accurately capture the effects of practical tank losses, which are inherently represented as a series resistance in real LC oscillators. To address this limitation, a novel theoretical framework is developed for analyzing phase and amplitude imbalances in quadrature LC CMOS oscillators. The analysis is carried out in the phasor-domain using a modified small-signal circuit model (MSSCM). Unlike prior studies, the proposed approach explicitly incorporates the series resistance of the inductor as the dominant tank loss, resulting in a nonlinear formulation for phase and amplitude imbalances. Closed-form expressions for these errors are derived across three fundamental topologies: series-coupled, parallel-coupled, and source-injection-coupled accounting for Q variations. Additionally, the maximum steady-state oscillation amplitude under mismatch conditions is derived. The accuracy of the proposed model is validated through extensive simulations, showing strong agreement with numerical results. This comprehensive analysis offers deeper insight into the mechanisms underlying phase and amplitude imbalances and, ultimately, presents multiple techniques for the simultaneous compensation of these errors across different quadrature oscillator topologies.
2026
19 VLSI_2026_37
A Complementary 3T-Based eDRAM Macro for High-Density Dual-Direction CAM and Logic-in-Memory
Content-addressable memory (CAM) is regarded as an attractive solution for data-intensive applications with high-density search demands. To further improve functional flexibility yet at a low cost, several CAM macros have been developed to support multiple bit-wise logic operations. However, conventional SRAM-based CAM designs are constrained by the large bitcell area, posing significant challenges to achieve higher density. To address this issue, we propose a complementary 3T (C3T) based embedded dynamic random access memory (eDRAM) macro for high-density dual-direction CAM searching and logic-in-memory operations. First, we propose a compact C3T bitcell featuring a pair of complementary decoupled read ports, enabling dual-port read and efficient CAM operations. Second, we present a compact dynamic-circuit-based sense amplifier (DSA) to optimize the area of readout peripheral circuitry while mitigating the read bit line saturation issue. Additionally, we implement dual-direction CAM searching and logic-in-memory operations exploiting the C3T-based eDRAM macro. A 4 Kb C3T-based eDRAM macro has been validated in a commercial 40-nm CMOS process. Post-layout results demonstrate a 53% reduction in the bitcell area and a 58.1% reduction in the macro area compared to the state-of-the-art 6T compute SRAM. Moreover, the proposed design achieves a maximum frequency of 578 MHz for binary CAM (BCAM) searching operations and 694 MHz for logic operations, with energy consumption of 1.12 fJ/bit and 26.8 fJ/bit, respectively.
2026
20 VLSI_2026_38
Design for Slew-Rate in Multi-Stage CMOS OTAs
Cascading gain stages in CMOS Operational Transconductance Amplifiers (OTAs) has become a necessity in applications with high gain requirements, where the contribution of each stage to the overall gain is well-known and carefully designed. Many of these applications also impose requirements on speed, including a minimum Slew-Rate (SR) to ensure signal fidelity. However, the impact of individual gain stages on the overall SR in multi-stage OTAs has been difficult to characterize, let alone carefully design. The difficulty arises from the complexity of the compensation networks involved in these OTAs. This paper presents a systematic design approach for achieving a target SR in multi-stage CMOS OTAs, enabled by a novel analytical model for estimating the lower-bound Slew-Rate in multi-stage OTAs. The model evaluates individual currents and equivalent capacitances at the output node of each stage, providing insights on the dominant node slowing down the overall SR. For generality, the model establishes the SR analysis based on N-stage designs and considers widely employed compensation networks. Example designs, with post-layout simulations and measurements of a 3- and a 4-stage CMOS OTA, and with post-layout simulations of a 5-stage CMOS OTA, are presented for validating the model’s utility. The results show strong agreement between theoretical, simulated, and measured SR values, confirming the model’s reliability in estimating the lower-bound SR and its utility in a systematic design-for-SR approach in multi-stage CMOS OTAs.
2026
21 VLSI_2026_46
A 1.5-GS/s 7-bit Charge-Injection SAR ADC Using a PVT-Tracking 1-bit Metastability Detector
This brief presents a 7-bit 1.5-GS/s area-efficient asynchronous Charge Injection Successive Approximation Register (CI-SAR) analog-to-digital converter (ADC). The proposed ADC architecture consists of a 6-bit CI-SAR and a 1-bit Metastability Detector (MD), forming a 7-bit ADC. Superior area efficiency is achieved by architecting the CI-DAC in a segmented structure. The Charge Injection Cell (CIC) is biased by a temperature-aware bias generator that maintains the ADC full-scale over a wide temperature range. The gm-boosted strongARM comparator helps achieve high sampling speed up to 1.5 GS/s. A background-calibrated metastability detector extracts an additional bit under different process, voltage, and temperature (PVT) conditions. The proposed ADC is fabricated in a 28-nm CMOS process in an area of 202 µm². The peak SNDR is 39.04 dB with FoMw = 17.4 fJ/conv.-step. The measured SNDR drops by less than 2.7 dB across the −40 °C to 80 °C temperature variation and the 0.9-V to 1.1-V supply voltage variation.
2026
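Entry 21 (VLSI_2026_46) quotes an SNDR and a Walden figure of merit (FoMw). As a reading aid, the sketch below applies the textbook definitions ENOB = (SNDR − 1.76) / 6.02 and FoMw = P / (2^ENOB · fs). The function names are assumptions, and the abstract does not state the power consumption or the exact conditions used for its FoMw, so the demo only converts the reported SNDR to ENOB and uses clearly made-up numbers for the FoM call.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (standard formula)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, sndr_db, sample_rate_hz):
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2 ** enob_from_sndr(sndr_db) * sample_rate_hz)


if __name__ == "__main__":
    # SNDR value taken from the abstract of entry 21.
    print(f"ENOB at 39.04 dB SNDR: {enob_from_sndr(39.04):.2f} bits")

    # Hypothetical inputs, for illustration only (not from the abstract).
    fom = walden_fom(power_w=1e-3, sndr_db=49.92, sample_rate_hz=1e6)
    print(f"Example FoMw: {fom * 1e12:.2f} pJ/conv.-step")
```

A lower FoMw means less energy spent per effective conversion step, which is why it is commonly used to compare ADCs with different resolutions and sample rates.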
22 VLSI_2026_49
An 8-bit Precision 10T SRAM Compute-in-Memory Macro Using ADC With Small Area
This brief proposes a charge-domain analog Compute-In-Memory (CIM) architecture based on multi-bit SRAM. The proposed structure consists of a 64 × 64 10T1C SRAM array, which can perform 1024 MAC operations between 4-bit signed input and weight within a single clock cycle. An 8-bit Analog-to-Digital Converter (ADC) is employed for quantization, converting the analog Multiply-Accumulate (MAC) results into 8-bit digital signals for output. The ADC module adopts capacitor array multiplexing, pseudo-differential sampling with a double-terminal flipping method, and matched layout design to save area. The proposed circuit is implemented in a 28-nm process and operates at a supply voltage of 0.8 V. It achieves an energy efficiency of 184 TOPS/W and an area efficiency of 7.3 TOPS/mm², and reaches an accuracy of 86.9% in training on the CIFAR-10 dataset. Compared with other works, this work achieves the highest energy efficiency and the highest area efficiency at 4-bit input and weight precision, normalized to 28 nm.
2026
23 VLSI_2026_67
Energy-Efficient Izhikevich Neuron Design Using Approximate CORDIC-Based Multipliers for Low-Power Neuromorphic Hardware
The Izhikevich neuron model is a widely used spiking neuron model that combines biologically plausible behavior with computational efficiency. As hardware implementations often suffer from high power and area usage due to multiplication operations, CORDIC-based multipliers have been a promising solution. However, as the energy efficiency of CORDIC-based multipliers is still limited, further improvement is needed. This paper proposes an approximate-adder-based CORDIC approach for implementing the Izhikevich neuron, replacing multipliers with shift-add operations and exact adders with approximate ones to reduce resource usage. Simulation and ASIC implementation results demonstrate significant improvements in energy efficiency with 14.56% increase in throughput and 10.23% area reduction, while compromising the neuron dynamics by only 0.07% in Mean Relative Error (MRE). In simulation, the membrane potential traces of the proposed model closely match those of the standard HOMIN model across all neuron behaviors, and the optimal configuration of 9 CORDIC iterations achieves the best trade-off between accuracy and efficiency.
2026
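The shift-add multiplication that entry 23 (VLSI_2026_67) relies on can be shown with a linear-mode CORDIC loop. The sketch below uses exact additions, so it illustrates only the multiplier-free structure and not the approximate adders proposed in the paper. The function name, the Q12 fixed-point format, and the operand values are assumptions; the default of 9 iterations echoes the configuration named in the abstract.

```python
def cordic_multiply(x_fixed, z_fixed, iterations=9, frac_bits=12):
    """Approximate x * z using only shifts and adds (linear-mode CORDIC).

    x_fixed, z_fixed : operands as fixed-point integers (value * 2**frac_bits)
    iterations       : number of shift-add steps
    frac_bits        : number of fractional bits (assumed Q format)
    The multiplier z must satisfy |z| < 2 for the iteration to converge.
    """
    one = 1 << frac_bits
    y = 0
    z = z_fixed
    for i in range(iterations):
        if z >= 0:
            y += x_fixed >> i   # add x * 2^-i
            z -= one >> i       # consume 2^-i of the multiplier
        else:
            y -= x_fixed >> i   # subtract x * 2^-i
            z += one >> i
    return y


if __name__ == "__main__":
    frac_bits = 12
    scale = 1 << frac_bits
    x, z = 0.75, 0.5
    product = cordic_multiply(int(x * scale), int(z * scale), 9, frac_bits)
    print(f"CORDIC result: {product / scale:.5f}, exact: {x * z:.5f}")
```

Each extra iteration roughly halves the worst-case error, so the iteration count sets the accuracy against latency trade-off that the abstract describes.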
24 VLSI_2026_69
A Counter-Based Addition Circuit Design for Stochastic Computing
Stochastic computing (SC) encodes real values via probabilistic bitstreams, enabling complex arithmetic operations to be realized by simple logic gates. However, the requirement of longer bitstreams to ensure computing accuracy leads to higher latency, partially offsetting the low-complexity advantage of SC. To address this, this work utilizes a dynamic truncation method for stochastic bitstreams and designs an energy-efficient counter-based addition circuit (CBAC) through effective bit recognition and correlation. Furthermore, a tree-structured cascading architecture is used to perform multi-input addition computing. Experimental results demonstrate that the proposed CBAC outperforms the state-of-the-art designs. For instance, a 16-input configuration achieves at least 75.9% reduction in mean square error (MSE) and a more than 43.1% reduction in area. When applied to polynomial computation and Gaussian filtering, the proposed architecture exhibits superior accuracy and efficiency, delivering MSE reductions of at least 10.7% and area reductions exceeding 6.8%.
2026
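The basic idea behind entry 24 (VLSI_2026_69), adding values that are encoded as probabilistic bitstreams by counting ones, can be sketched as follows. This is a plain counter-based adder under a unipolar encoding and does not include the dynamic truncation, bit recognition, or tree-structured cascading described in the abstract. The function names, the bitstream length, and the seeded software random source are assumptions.

```python
import random


def encode(value, length, rng):
    """Unipolar stochastic encoding: each bit is 1 with probability `value`."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def counter_add(streams):
    """Add several stochastic bitstreams by counting ones.

    In each cycle the ones across all inputs are accumulated into a
    counter; dividing by the stream length recovers the estimated sum.
    """
    length = len(streams[0])
    total_ones = 0
    for cycle in range(length):
        total_ones += sum(stream[cycle] for stream in streams)
    return total_ones / length


if __name__ == "__main__":
    rng = random.Random(1)            # fixed seed for repeatability
    values = [0.25, 0.5, 0.125]
    streams = [encode(v, 4096, rng) for v in values]
    print(f"estimated sum: {counter_add(streams):.4f}, exact: {sum(values):.4f}")
```

Shortening the bitstreams reduces latency but increases the estimation error, which is the trade-off that motivates truncation schemes like the one in the paper.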
25 VLSI_2026_70
An Approximate Digital CIM Macro With Low-Power Multiply-Add Units and Dynamic Sparse-Adaptive Configuring for Edge AI Inference
This letter presents an approximate digital compute-in-memory (CIM) macro for low-power edge AI inference. It introduces three hierarchical innovations: 1) novel fused approximate multiply-add units (FAMUs) that reduce power and area consumption; 2) a bit-critical weight allocation architecture that optimally balances accuracy and hardware cost; and 3) a dynamic sparsity-adaptive configuration method to minimize accuracy loss in real time. The macro achieves an energy efficiency of 60.35 TOPS/W and an area efficiency of 1105 GOPS/mm² for INT8 MACs, outperforming prior works. It attains negligible accuracy degradation on multiple mainstream datasets and is well suited to edge AI inference.
2026
26 VLSI_2026_74
High-Speed Floating Voltage Level Shifter With Self-Locked Noise Immunity for High-Side Gate Driver
Level shifters (LSs) are essential circuit elements within high-side gate drivers, responsible for translating logic signals from a low-voltage (LV) domain to a high-voltage (HV) domain. However, high-frequency gate drivers impose stringent requirements on the propagation delay and reliability of LSs. This article presents a novel high-speed floating voltage LS that achieves subnanosecond propagation delays and enhanced noise immunity. The proposed structure, termed the common-mode self-locking (CMSL) LS, mitigates both common-mode (CM) and differential-mode (DM) noise. The proposed architecture incorporates two complementary techniques: a cross-coupled current mirror topology to suppress CM transient noise, and a combination of resistive signal sensing and a self-locking (SL) mechanism to provide DM transient noise immunity. The proposed technique achieves subnanosecond propagation delay while ensuring high noise immunity and maintaining excellent tolerance to negative voltage swings on the floating supply rail. The design is implemented in a 0.18-µm bipolar CMOS–DMOS on the silicon-on-insulator (BCD-on-SOI) process. Simulation results demonstrate that the final LS achieves an average propagation delay as low as 410 ps. The figure of merit (FOM) is 0.057 ns/(µm·V), indicating superior performance compared to existing solutions. The experimental results validate the LS’s performance, demonstrating an overall propagation delay of less than 19 ns and a dV/dt noise immunity of up to 42.8 V/ns. The design supports a negative VS swing of −4.12 V at a 5-V supply voltage. Furthermore, verification in a 600-V BCD process confirms the architecture’s intrinsic robustness and superior immunity performance suitable for HV applications.
2026
27VLSI_2026_75
A T8T-SRAM Computing-in-Memory Macro for Ternary Deep Neural Networks and Boolean Logic Computations
2026
28VLSI_2026_77
An Independently Programmable Multioutput SAR SC DC-DC Converter
This article presents an integrated, independently programmable multioutput (MO) successive-approximation (SAR) switched-capacitor (SC) DC-DC converter. Its two independently programmable output channels each range between 0.4 V and VBAT with a resolution of VBAT/3^N, where VBAT is the input voltage and N is the number of SC units cascaded in the SAR structure. A modified dual-output (DO) SC DC-DC converter unit cell with reduced output ripple is proposed and integrated in the SAR-SC structure to deliver a wide input and output voltage range. The modified DO-SC units are cascaded via switch boxes to yield a fine-grain voltage resolution. The 1357 ordered pairs of conversion ratios (CRs) ensure that the two output channels are orthogonal and have near-zero crosstalk under varying load conditions. The proposed converter is fabricated in a 180 nm CMOS process and achieves a peak efficiency of 94% across a combined load variation of 10 to 800 µA.
2026
29VLSI_2026_78
A 28 nm 1.3 TFLOPS/mm² Floating-Point SRAM-Based CIM Macro With Asynchronous Normalization and Parallel Sorting Alignment for AI-Edge Chip
State-of-the-art AI edge devices require floating-point (FP) multiply-accumulate (MAC) operations with high energy efficiency and inference accuracy. FP computing in-memory (FP-CIM) has a broader range of applications compared to integer CIM. However, FP-CIM can incur greater power, delay, and area overheads than integer CIM due to the inherent complexity of FP computational flow. In this article, we introduce a new method for asynchronous exponent normalization and parallel mantissa alignment. This approach allows us to add exponents and find the maximum sum simultaneously. We also replace the traditional subtraction and shifting for mantissa alignment with a cross-structure maximum-finding method, enabling FP-CIM to be achieved with lower delay, area, and power overheads. The macro is designed in a TSMC 28 nm process, with a memory size of 6 Kb, a layout area of 0.067 mm², and an area efficiency of 1.3 TFLOPS/mm². Simulation results show that the macro computational frequency and energy efficiency can reach 150 MHz and 12.8 TFLOPS/W, respectively, at 900 mV, while performing FP-MAC operations.
2026
30VLSI_2026_79
A 40 nm Buffer-Free 7T-SRAM Analog Charge-Domain CIM Macro With Merging Timing Based On Time-Row Division Strategy
Computing-in-memory (CIM) macros based on static random access memory (SRAM) are meant to increase capacity while improving energy efficiency and reducing computing latency. However, traditional analog designs still face several key challenges, including long computing latency from separated computing phases, negative voltage fluctuations from massive parallel computing, and low bitcell density from additional transistors and capacitors for multiplication. In addition, prior works support only time-aligned inputs. To overcome the above challenges, this work proposes a buffer-free 7T-SRAM charge-domain CIM macro. It has four key features: 1) a compact 7T SRAM bitcell structure for high energy efficiency; 2) a configurable input unit to support different sizes of input activations; 3) a time-row division (RD) strategy to support real-time processing and alleviate negative voltage fluctuations; and 4) a merging timing to conceal the input phase for high throughput. The fabricated 512-Kb SRAM-CIM macro in 40 nm achieves 79.3–290.4 TOPS/W at 4-bit precision.
2026
31VLSI_2026_83
A Low Area Built-In Self-Repair Using Hybrid Fault Address Memory for HBM
The massive computational requirements of large language models (LLMs) have increased the need for high-bandwidth memory (HBM), which involves high-volume data transfers. The high cell capacity of HBM results in extended test and repair times, leading to increased manufacturing costs. To reduce test time, a built-in self-repair (BISR) circuit, integrated into the HBM base die to detect and repair faults, tests multiple banks in parallel. Conventional BISR approaches adopt content-addressable memory (CAM) for fault classification to reduce repair time. However, a dedicated CAM on each bank leads to substantial area overhead associated with its comparison logic. To address these issues, this article proposes a novel BISR architecture that decouples fault classification from fault storage. A low-area linked CAM is shared across banks for fault classification, while small-area first-in first-out (FIFO) memories allocated to each bank store the classified fault information, which substantially reduces overall area overhead. Furthermore, the proposed architecture reorders the repair solution search sequence toward the most promising candidates by swapping fault entries during test idle periods, thereby significantly reducing repair time. Experimental results demonstrate that the proposed BISR architecture achieves low area overhead and fast repair time for high-density HBM.
2026
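
The floating-point CIM abstract above (entry 29) contrasts its method with the traditional flow of adding exponents, finding the maximum sum, and aligning mantissas by subtraction and shifting. A minimal Python sketch of that traditional baseline follows. It is not the paper's circuit: the function name `aligned_fp_mac` and the operand format (integer exponent, integer mantissa) are assumptions made for illustration, and shifted-out bits are simply truncated.

```python
def aligned_fp_mac(inputs, weights):
    """Baseline aligned FP multiply-accumulate (illustrative only).

    Each operand is a hypothetical (exp, man) pair standing for man * 2**exp.
    Returns (max_exp, acc), so the result value is acc * 2**max_exp.
    """
    # Step 1: the exponent of each product is the sum of operand exponents.
    prod_exps = [ea + ew for (ea, _), (ew, _) in zip(inputs, weights)]
    # Step 2: the largest product exponent becomes the common reference.
    max_exp = max(prod_exps)
    # Step 3: subtract to get each shift distance, shift the mantissa
    # product right by it (truncating), and accumulate as integers.
    acc = 0
    for (_, ma), (_, mw), e in zip(inputs, weights, prod_exps):
        acc += (ma * mw) >> (max_exp - e)
    return max_exp, acc


# 8*2^0 * 4*2^1 + 3*2^2 * 2*2^0 = 64 + 24 = 88 = 22 * 2^2
print(aligned_fp_mac([(0, 8), (2, 3)], [(1, 4), (0, 2)]))
```

Steps 1 and 2 run one after the other here; the abstract's stated contribution is performing them simultaneously and replacing the subtract-and-shift of step 3.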

AREA EFFICIENT / TIMING & DELAY REDUCTION

S.No Code Title Year
1 VLSI_2026_06_AE
VASE: Vector Memory Using Bit-Level Address Segmentation for High-Speed Memory Testing
Abstract: An advanced vector memory architecture using bit-level address segmentation is proposed for high-speed memory testing. The design improves test pattern generation by cyclically reusing stored vectors while minimizing memory capacity requirements. This approach enables faster testing, reduced hardware complexity, and enhanced fault coverage for modern memory systems. Experimental results demonstrate improved testing efficiency and reduced implementation overhead.
2026
2 VLSI_2026_08_AE
DiP: A Scalable, Energy-Efficient Systolic Array for Matrix Multiplication Acceleration
Abstract: DiP presents a scalable systolic array architecture optimized for matrix multiplication acceleration. The proposed design reduces communication overhead between processing elements while maximizing data reuse. Experimental evaluation shows significant improvements in throughput and energy efficiency, making it highly suitable for AI inference and large-scale machine learning workloads.
2026
3 VLSI_2026_09_AE
Pipe-LPAQ: A Fully Pipelined Architecture for LPAQ Data Compression on FPGA
Abstract: Pipe-LPAQ introduces a fully pipelined FPGA architecture for LPAQ data compression. The design increases throughput by parallelizing compression stages and eliminating processing bottlenecks. FPGA implementation demonstrates high compression speed, low resource utilization, and efficient performance for storage and communication applications.
2026
4 VLSI_2026_17_AE
Countering Side-Channel Attacks With a Dynamic S-Box Based on Affine Transformations and Gold Sequences
Abstract: This paper proposes a dynamic S-box design based on affine transformations and Gold sequences to resist side-channel attacks. The approach generates varying substitution patterns during operation, significantly reducing vulnerability to differential power analysis while maintaining low hardware overhead and preserving cryptographic efficiency.
2026
5 VLSI_2026_19_AE
High Throughput and Compact FPGA TRNGs Based on Hybrid Entropy, Reinforcement Strategies, and Automated Exploration
Abstract: A compact FPGA-based true random number generator using hybrid entropy sources is presented. Reinforcement strategies and automated exploration optimize entropy stability and throughput. Experimental results show improved randomness quality, reduced resource usage, and high-speed operation suitable for secure embedded cryptographic systems.
2026
6 VLSI_2026_24_AE
EVMx: An FPGA-Based Accelerator for Smart Contract Processing
Abstract: EVMx presents an FPGA-based accelerator designed to enhance smart contract execution by optimizing Ethereum Virtual Machine operations. The architecture reduces execution latency through parallelized instruction processing and hardware-level optimization, significantly improving throughput and energy efficiency for blockchain applications.
2026
7 VLSI_2026_25_AE
Design Space Exploration of a Unified FPGA Accelerator for Elliptic-Curve-Based Functions in Attribute-Based Encryption
Abstract: This paper explores the design space of a unified FPGA accelerator supporting elliptic-curve cryptographic operations for attribute-based encryption. The architecture balances resource utilization, computational efficiency, and flexibility, achieving significant acceleration for secure cryptographic processing while maintaining low hardware overhead.
2026
8 VLSI_2026_26_AE
Exa: A Unified Architecture for Multi-Scalar Multiplication and Polynomial Computation in Zero-Knowledge Proof
Abstract: Exa introduces a unified hardware architecture for accelerating multi-scalar multiplication and polynomial computation in zero-knowledge proof systems. The proposed design improves computational throughput and reduces latency by efficiently sharing arithmetic resources, enabling scalable acceleration for privacy-preserving cryptographic protocols.
2026
9 VLSI_2026_27_AE
A RISC-V Accelerator for Sequence Decoding in Mobile DNA Sequencers
Abstract: This work presents a RISC-V-based hardware accelerator for sequence decoding in portable DNA sequencing systems. The architecture improves decoding efficiency through instruction-level acceleration and optimized data handling, enabling faster genomic analysis while maintaining low power consumption for mobile biomedical applications.
2026
10 VLSI_2026_29_AE
Area-Time Efficient Formula-Based BCH Decoder With Trace Mechanism for WBAN Applications
Abstract: An area-time efficient BCH decoder architecture is proposed for wireless body area network applications. By incorporating a trace-based decoding mechanism and formula-driven error correction, the design achieves reduced hardware complexity, lower latency, and improved decoding reliability for energy-constrained medical communication systems.
2026
11 VLSI_2026_30_AE
An LSGQ-FFS Framework for Adaptive Optimization of Hybrid INT-CIM Architecture
Abstract: This paper presents an LSGQ-FFS framework for adaptive optimization of hybrid integer computing-in-memory architectures. The proposed framework dynamically adjusts hardware configurations to improve computational accuracy, efficiency, and power optimization while reducing latency in data-intensive processing tasks.
2026
12 VLSI_2026_40_AE
Construction of Lightweight S-Boxes With Low Boomerang Uniformity
Abstract: A lightweight S-box construction methodology is proposed to achieve low boomerang uniformity for enhanced cryptographic resistance. The design improves security against advanced cryptanalysis while maintaining compact hardware implementation suitable for lightweight secure embedded systems.
2026
13 VLSI_2026_41_AE
Modular, Low-Cost Bus and ECC Encoders for Memory Macros Under Maximal Power Constraints
Abstract: This work introduces modular and low-cost bus and ECC encoders designed for memory macros operating under strict power constraints. The architecture minimizes encoding complexity while ensuring reliable error correction and efficient communication in low-power memory subsystems.
2026
14 VLSI_2026_42_AE
A Scalable Segment-Parallel Architecture for High-Efficiency Lossless Data Compression
Abstract: A scalable segment-parallel architecture is proposed for high-efficiency lossless data compression. The design enables parallel processing of data segments to improve throughput and compression performance while maintaining low resource consumption across FPGA implementations.
2026
15 VLSI_2026_44_AE
A High-Speed FPGA Implementation for IVF-PQ Index Construction
Abstract: This paper presents a high-speed FPGA implementation for inverted file product quantization index construction. The proposed architecture accelerates nearest-neighbor search preprocessing with optimized parallelism, achieving lower latency and improved throughput for large-scale vector database applications.
2026
16 VLSI_2026_45_AE
TRIM: Acceleration of Multiplication-Less Neural Networks via Versatile Sparsities
Abstract: TRIM proposes a hardware acceleration framework for multiplication-less neural networks using versatile sparsity patterns. The architecture improves computational efficiency, reduces memory overhead, and enables low-power AI inference while maintaining competitive accuracy.
2026
17 VLSI_2026_47_AE
A Seamless Mode Transition Scheme With DCM Compensation for AOT Buck Converter
Abstract: A seamless mode transition scheme with discontinuous conduction mode (DCM) compensation is introduced for adaptive on-time (AOT) buck converters. The design reduces voltage overshoot and improves transient response, ensuring stable and efficient power conversion.
2026
18 VLSI_2026_51_AE
A Counter-Based Addition Circuit Design for Stochastic Computing
Abstract: This paper presents a counter-based addition circuit for stochastic computing. The architecture reduces hardware complexity and improves arithmetic precision while enabling efficient implementation of probabilistic computing systems.
2026
19 VLSI_2026_53_AE
Hierarchical Approximate Min-Sum and Multi-Frame Parallel QC-LDPC Decoder for FPGA Optical Links
Abstract: A hierarchical approximate min-sum decoding architecture is proposed for QC-LDPC codes in FPGA optical links. The multi-frame parallel decoder achieves low latency, high throughput, and efficient error correction for high-speed optical communication systems.
2026
20 VLSI_2026_55_AE
FPGA-Based High-Speed Gray-Code Phase Shift Profilometry System
Abstract: This work presents an FPGA-based high-speed Gray-code phase shift profilometry system for real-time 3D surface reconstruction. The architecture accelerates phase decoding and image processing, enabling accurate and efficient depth measurement.
2026
21 VLSI_2026_59_AE
A Resource Reuse Strategy for Large-Scale Matrix Operations in HLS-Based FPGA Design
Abstract: A resource reuse strategy is proposed for large-scale matrix operations in high-level synthesis FPGA design. The approach improves hardware utilization and reduces resource overhead while maintaining computational performance for matrix-intensive applications.
2026
22 VLSI_2026_64_AE
Two Novel Approximate Radix-4 Booth Encoders for Efficient Signed Approximate Booth Multipliers
Abstract: This paper introduces two approximate radix-4 Booth encoders for signed multiplication. The proposed encoders reduce circuit complexity, power consumption, and delay while maintaining acceptable computational accuracy for error-tolerant signal processing applications.
2026
23 VLSI_2026_66_AE
Efficient Approximate Ternary Multipliers for Emerging Nanodevices
Abstract: Efficient approximate ternary multiplier architectures are proposed for emerging nanodevice technologies. The designs achieve reduced area and energy consumption while maintaining sufficient accuracy for low-power arithmetic-intensive applications.
2026
24 VLSI_2026_68_AE
Toward Efficient Logarithmic Converter Circuit Design via Constraint-Driven Parameter Exploration
Abstract: This work proposes a constraint-driven exploration framework for designing efficient logarithmic converter circuits. The methodology optimizes design parameters to improve conversion precision, reduce hardware overhead, and enhance performance across digital arithmetic systems.
2026
25 VLSI_2026_71_AE
Big Integer Parallel Stream Modular Multiplier With Variable Bit-Widths
Abstract: A parallel stream modular multiplier supporting variable bit-width big integer arithmetic is presented. The architecture improves throughput and flexibility for cryptographic applications while minimizing latency and hardware resource consumption.
2026
26 VLSI_2026_72_AE
Fusing Adds and Shifts for Efficient Dot Products
Abstract: This paper proposes an efficient hardware technique that fuses addition and shift operations for dot product computation. The design reduces computational latency and hardware complexity, enabling faster arithmetic processing for AI and DSP workloads.
2026
27 VLSI_2026_73_AE
HFMLLR: Heterogeneous Feature Mining for Low-Overhead Latency Reduction Scheme of LDPC Codes in 3-D TLC NAND Flash Memory
Abstract: HFMLLR introduces a heterogeneous feature mining framework for reducing LDPC decoding latency in 3-D TLC NAND flash memory. The approach improves error correction efficiency while minimizing computational overhead and enhancing storage reliability.
2026
28 VLSI_2026_81_AE
An Area-Efficient and Reconfigurable Accelerator for Massive MIMO Systems
Abstract: An area-efficient and reconfigurable accelerator for massive MIMO communication systems is proposed. The architecture supports flexible processing configurations while reducing silicon area and improving throughput for next-generation wireless communication.
2026
29 VLSI_2026_84_AE
FPGA-Based Low-Power Signed Approximate Multipliers for Diverse Error-Resilient Applications
Abstract: This work presents low-power signed approximate multipliers implemented on FPGA. The designs achieve reduced power consumption and hardware cost while preserving acceptable accuracy for multimedia and error-resilient processing applications.
2026
30 VLSI_2026_85_AE
Analysis of a Delay-Element-Based Technique for Enhancing Soft Error Tolerance at Input Nodes Around Clock Edges
Abstract: This paper analyzes a delay-element-based technique to improve soft error tolerance at input nodes near clock edges. The proposed method enhances system reliability by mitigating transient fault sensitivity in synchronous digital circuits.
2026
31 VLSI_2026_86_AE
Entropy Model of FIRO-Based TRNGs With Differential Structure
Abstract: An entropy model for FIRO-based true random number generators with differential structure is presented. The analysis improves randomness characterization and enables more reliable hardware random number generation for secure cryptographic applications.
2026
32 VLSI_2026_87_AE
An Efficient Approximate Radix-8 Booth Multiplier for Edge Detection in Bioimages by Field Programmable Gate Array
Abstract: This paper presents an approximate radix-8 Booth multiplier optimized for FPGA-based edge detection in bioimages. The architecture improves computational speed and reduces power consumption while maintaining effective image processing accuracy.
2026
33 VLSI_2026_89_AE
Efficient Multiplierless FPGA Architecture for Brain-Inspired Rulkov Neuron Mapping
Abstract: An efficient multiplierless FPGA architecture is proposed for implementing Rulkov neuron models. The design reduces hardware complexity and power consumption while enabling high-speed brain-inspired neural computation.
2026
34 VLSI_2026_90_AE
A Phase-Walk-Based True Random Number Generator Exploiting Dual-Ring Phase Jitter Comparison
Abstract: This work introduces a phase-walk-based true random number generator exploiting dual-ring oscillator phase jitter comparison. The architecture achieves high entropy generation, strong randomness quality, and efficient hardware implementation for secure systems.
2026
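
Several entries in this table build on radix-4 Booth encoding (for example entry 22). As background, the sketch below shows the exact, textbook radix-4 Booth recoding that approximate encoders start from. It does not reproduce the approximate encoders of any listed paper, and the function names are chosen here for illustration.

```python
def booth_radix4_digits(y, bits):
    """Recode a two's-complement integer into radix-4 Booth digits.

    Returns digits d_i in {-2, -1, 0, 1, 2}, least significant first,
    such that y == sum(d_i * 4**i). `bits` must be even.
    """
    if bits % 2:
        raise ValueError("bits must be even")
    if not -(1 << (bits - 1)) <= y < (1 << (bits - 1)):
        raise ValueError("y does not fit in the given width")
    u = y & ((1 << bits) - 1)  # two's-complement bit pattern of y

    def bit(i):
        # Bit -1 is the implicit zero to the right of the LSB.
        return 0 if i < 0 else (u >> i) & 1

    # Each digit comes from an overlapping triplet (y[2i+1], y[2i], y[2i-1]).
    return [bit(i - 1) + bit(i) - 2 * bit(i + 1) for i in range(0, bits, 2)]


def booth_multiply(x, y, bits):
    """Multiply x by y using one partial product per Booth digit."""
    return sum(d * x * 4**i for i, d in enumerate(booth_radix4_digits(y, bits)))


print(booth_radix4_digits(127, 8))  # [-1, 0, 0, 2]
print(booth_multiply(-25, 127, 8))  # -3175
```

The recoding halves the number of partial products compared with bit-by-bit multiplication, which is why the encoder is a common target for area and power reduction.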

HIGH SPEED DATA TRANSMISSION & NETWORKING

S.No Code Title Year
1 VLSI_2026_01_HS
FPGA-Based Real-Time ECG Classification System Using Quantized Inception-ResNeXt Neural Network and CWT Approximation
Real-time ECG classification system using quantized deep learning model optimized for FPGA deployment. Uses CWT approximation for feature extraction and low-power inference.
2026
2 VLSI_2026_28_HS
An Efficient VLSI Architecture for Hammerstein-Type Spline Adaptive Filters
Proposes hardware-efficient adaptive filter architecture using Hammerstein nonlinear modeling with spline-based optimization for real-time signal processing.
2026
3 VLSI_2026_39_HS
A High-Energy-Efficiency Lightweight BNN Accelerator for Arrhythmia Detection
Implements binarized neural network accelerator optimized for wearable arrhythmia detection with ultra-low energy consumption and high throughput.
2026
4 VLSI_2026_48_HS
An Area-Efficient Normal Input/Output Ordered Memory-Based FFT Using an SC Kernel
Designs memory-optimized FFT architecture with structured computation kernel for reduced area and improved pipeline efficiency.
2026
5 VLSI_2026_50_HS
An IR-UWB Transmitter Using Two-Dimensional Differential Pulse Position Modulation
Presents IR-UWB transmitter using 2D-DPPM modulation for ultra-wideband low-power wireless communication systems.
2026
6 VLSI_2026_52_HS
FPGA Implementation of a Real-Time Parallel Loop-Unrolled DFE for PAM-4 IM/DD Optical Links
Implements high-speed decision feedback equalizer optimized for PAM-4 optical communication links using loop unrolling for throughput improvement.
2026
7 VLSI_2026_54_HS
EVMx: An FPGA-Based Accelerator for Smart Contract Processing
Accelerates smart contract execution using FPGA-based EVM architecture for blockchain scalability and reduced latency.
2026
8 VLSI_2026_56_HS
A 32-Channel, Low Resources and High-Precision FPGA Time-to-Digital Converter (TDC) Based on Dynamic Phase Shifting (DPS) for 3-D LiDAR Applications
Presents multi-channel TDC architecture using dynamic phase shifting for high-resolution LiDAR depth sensing.
2026
9 VLSI_2026_61_HS
Scalable Network-on-Chip Design for FPGA Implementation
Proposes scalable NoC architecture optimized for FPGA-based multi-core communication systems with low congestion routing.
2026
10 VLSI_2026_62_HS
Single-Stage Flip PAM-8 Signal Transmission for W-Band Wireless System
Implements high-frequency PAM-8 modulation scheme for W-band wireless communication with improved spectral efficiency.
2026
11 VLSI_2026_65_HS
Analytical Error Evaluation and Hardware Implementation of Approximate Negation Circuits
Studies error behavior of approximate negation circuits and presents efficient hardware implementation for error-tolerant computing.
2026
12 VLSI_2026_76_HS
Hardware-Accelerated ASIC and Cardiac Monitoring System for Wearable Devices
Designs ASIC-based cardiac monitoring system optimized for wearable healthcare devices with real-time processing capability.
2026
13 VLSI_2026_80_HS
HSA: An Efficient Sparse CNN Accelerator Based on Kernel-Aware Hybrid Pruning
Proposes sparse CNN accelerator using hybrid pruning techniques for reducing computation cost in deep learning inference.
2026
14 VLSI_2026_82_HS
A 48-Gb/s PAM-4 Transceiver With Transition Boosting and RLM Calibration for Next-Generation Memory Interface Testing
High-speed PAM-4 transceiver with calibration techniques for next-gen memory interface testing and signal integrity improvement.
2026
15 VLSI_2026_88_HS
Noise Modulation of a Bandgap Reference by Using a Single Resistor
Introduces noise modulation technique in bandgap reference circuits using minimal component design for improved stability.
2026
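
Entry 6 in this table refers to a loop-unrolled decision feedback equalizer (DFE) for PAM-4. The sketch below illustrates the general idea of loop unrolling in a DFE under simplifying assumptions: a single feedback tap, ideal PAM-4 levels of -3, -1, 1, 3, and a known starting symbol. It is a textbook-style illustration, not the architecture of the listed paper, and all names are hypothetical.

```python
LEVELS = (-3, -1, 1, 3)  # assumed ideal PAM-4 symbol levels


def slice_pam4(v):
    """Map an equalized sample to the nearest PAM-4 level."""
    if v < -2:
        return -3
    if v < 0:
        return -1
    if v < 2:
        return 1
    return 3


def dfe_direct(samples, tap, init=-3):
    """One-tap DFE: each decision waits for the previous decision."""
    prev, out = init, []
    for x in samples:
        prev = slice_pam4(x - tap * prev)
        out.append(prev)
    return out


def dfe_unrolled(samples, tap, init=-3):
    """Same DFE with the feedback loop unrolled (speculative form)."""
    # Speculation: decide every sample for all four possible previous
    # symbols. This stage has no feedback, so hardware can parallelize it.
    candidates = [{p: slice_pam4(x - tap * p) for p in LEVELS} for x in samples]
    # Selection: only a 4-to-1 choice remains on the feedback path.
    prev, out = init, []
    for cand in candidates:
        prev = cand[prev]
        out.append(prev)
    return out


symbols = [3, -1, 1, -3, -3, 3, 1, -1]
# Hypothetical channel: each sample carries 25% of the previous symbol.
received = [s + 0.25 * p for s, p in zip(symbols, [-3] + symbols[:-1])]
print(dfe_unrolled(received, 0.25) == symbols)
```

Both functions return the same decisions. The unrolled form trades four slicers per sample for a shorter critical path, since the subtraction and slicing no longer sit inside the feedback loop.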

VLSI Design of Image, Video and Audio Processing

S.No Code Title Year
1 VLSI_2026_02_IM
An Efficient Accelerator for Dehazing Neural Network Based on Physical Perception Model and Cross-Scale Pixel Attention
Implements an FPGA-efficient dehazing accelerator using physical imaging priors and cross-scale attention. Improves real-time image clarity in low-visibility environments.
2026
2 VLSI_2026_07_IM
EDCSSM: Edge Detection With Convolutional State Space Model
Proposes edge detection using convolutional state space models for better spatial feature extraction and reduced computational complexity.
2026
3 VLSI_2026_11_IM
Event-Triggered Multi-Kernel Learning-Based Stochastic MPC With Applications in Building Climate Control
Introduces event-triggered MPC with multi-kernel learning for intelligent building climate optimization and energy savings.
2026
4 VLSI_2026_21_IM
An Energy-Efficient Kalman Filter Coprocessor Design for Multiple-Object Tracking Targeting at Video Understanding
Hardware Kalman filter coprocessor optimized for multi-object tracking with reduced power consumption for video analytics systems.
2026
5 VLSI_2026_23_IM
AHCO-YOLO: An Algorithm–Hardware Co-Optimization Framework for Energy-Efficient and Real-Time Object Detection on Edge Devices
Co-optimized YOLO-based object detection framework designed for real-time edge inference with energy-efficient hardware mapping.
2026
6 VLSI_2026_43_IM
TEA-SPS: A Tiny and Efficient Architecture for Softmax With Parallelism and Sparsity Adaptability
Designs compact softmax accelerator using parallel sparse computation techniques for efficient neural network inference.
2026
7 VLSI_2026_58_IM
Pipe-LPAQ: A Fully Pipelined Architecture for LPAQ Data Compression on FPGA
Implements pipelined LPAQ compression architecture optimized for FPGA throughput and low latency streaming data compression.
2026
8 VLSI_2026_60_IM
FPGA-Based Medical Image Processing Using Hardware-Software Co-Design Approach
Presents co-design framework for medical image processing using FPGA acceleration with optimized hardware-software partitioning.
2026
9 VLSI_2026_63_IM
Power Efficient Multiplier Design for Error Resilient Edge Applications
Develops low-power multiplier architecture designed for error-resilient edge computing applications with optimized energy efficiency.
2026
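
Entry 6 in this table concerns a hardware softmax unit. For reference, the sketch below shows the standard numerically stable softmax that such accelerators compute, where the maximum input is subtracted before exponentiation. It is a plain software reference only and does not model the parallelism or sparsity handling of the listed architecture.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(xs)  # subtracting the max keeps every exponent <= 0
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Large inputs would overflow math.exp without the max subtraction.
print(softmax([1000.0, 1001.0, 1002.0]))
```

Because every shifted exponent is at most zero, each `exp` result lies in (0, 1], which also bounds the range a fixed-point hardware implementation has to represent.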

+91 9566475911 | +91 9962588976 | projects@lemenizinfotech.com