LLVM-IR Instruction Latency Estimation Using Deep Neural Networks for a Software–Hardware Interface for Multi-Many-Cores


  • Hiro Mikami Graduate School of Informatics, Nagoya University, Japan
  • Seira Iwai Graduate School of Informatics, Nagoya University, Japan
  • Masato Edahiro Graduate School of Informatics, Nagoya University, Japan




neural network, estimation, multicore, embedded system, SHIM


This study presents a method for estimating the latency of each LLVM-IR instruction to enable effective parallelization in model-based development. In recent embedded systems, such as in-vehicle electronic control units, multi-many-core processors are used as the hardware and model-based development is used for the software. In designing these systems, estimating the performance of the blocks in a model and using those estimates for parallelization can improve both the degree of parallelism in the software and the accuracy of performance estimation in the early design stages of model-based development. Research is therefore being performed on a software performance estimation technique that uses the hardware feature description standardized as IEEE 2804-2019, the Software-Hardware Interface for Multi-many-core (SHIM). In SHIM, each LLVM-IR instruction is associated with a number of execution cycles on the target processor. However, a given LLVM-IR instruction may be compiled into several different assembly instruction sequences for the target processor, so its execution cycle count is not easy to estimate. In this study, we propose a method that uses deep neural networks to estimate the execution cycles of each LLVM-IR instruction. In experiments on the Raspberry Pi 3 Model B+, our method estimates LLVM-IR instruction latency more accurately than previous methods.
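As a rough illustration of how per-instruction latencies of the kind stored in SHIM could feed a block-level performance estimate, the following minimal sketch sums a latency value over the instructions in a block. The latency table, the opcode list, and the function name `estimate_block_cycles` are hypothetical and chosen only for illustration; the values are not measurements from the paper, and the sketch does not reproduce the authors' neural-network estimator.

```python
from collections import Counter

# Hypothetical per-instruction latencies in cycles.
# Illustrative values only, not results from the paper.
LATENCY_CYCLES = {
    "load": 4.0,
    "store": 2.0,
    "add": 1.0,
    "mul": 3.0,
    "br": 1.5,
}


def estimate_block_cycles(opcodes, latency=LATENCY_CYCLES, default=1.0):
    """Estimate the cycles of a block as the sum of per-instruction latencies.

    opcodes: iterable of LLVM-IR opcode names executed in the block.
    latency: mapping from opcode name to an estimated cycle count.
    default: fallback latency for opcodes missing from the table (an assumption).
    """
    counts = Counter(opcodes)
    return sum(latency.get(op, default) * n for op, n in counts.items())


# Opcode sequence of a small hypothetical block.
block = ["load", "load", "mul", "add", "store", "br"]
print(estimate_block_cycles(block))  # 4 + 4 + 3 + 1 + 2 + 1.5 = 15.5
```

A simple additive model like this ignores pipelining and cache effects; it only shows why the accuracy of each per-instruction latency matters for the block-level estimate.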








How to Cite

Mikami, H., Iwai, S., & Edahiro, M. (2023). LLVM-IR Instruction Latency Estimation Using Deep Neural Networks for a Software–Hardware Interface for Multi-Many-Cores. International Journal of Computers & Technology, 23, 49–79. https://doi.org/10.24297/ijct.v23i.9472



Research Articles