TOHOKU UNIVERSITY Researchers - Hiroyuki Takizawa

Details of the Researcher

Hiroyuki Takizawa

Section

Cyberscience Center

Job title

Professor

Degree

Doctor in Information Science (Tohoku University)

researchmap

https://researchmap.jp/h_takizawa

J-GLOBAL ID

200901079984691878

e-Rad No.

70323996

Research History 7

2024/04 - Present

Tohoku University
2019/04 - Present

Tohoku University　Cyberscience Center　Vice Director
2017/01 - Present

Tohoku University　Cyberscience Center　Professor
2009/01 - 2016/12

Associate Professor, Graduate School of Information Sciences, Tohoku University
2004/04 - 2008/12

Assistant Professor, Graduate School of Information Sciences, Tohoku University
2003/03 - 2004/03

Research Associate, Information Synergy Center, Tohoku University
1999/10 - 2003/02

Research Associate, Integrated Information Processing Center, Niigata University

Committee Memberships 43

HPCI Consortium　Director

2024/07 - Present
IPSJ Special Interest Group on High Performance Computing Steering Committee　Secretary (Vice Chair)

2021/04 - Present
HPCI Cooperative Service Committee　Member

2021/03 - Present
International Workshop on Automatic Performance Tuning Program Committee　Program Committee Member

2009 - Present
COOL Chips Conference Program Committee　Program Committee Member

2007 - Present
HPC Asia 2026　General Chair

2025 - 2026
HPCI Cooperative Service Organizing and Working Group　Working Group Chair

2019/04 - 2021/03
IPSJ Special Interest Group on High Performance Computing Steering Committee　Steering Committee Member

2015/04 - 2019/03
HPC Asia 2019　Program Committee Track co-chair

2018 - 2019
ACM/IEEE Supercomputing Conference, Tutorials Committee　Committee member

2017 - 2019
Legacy HPC Application Migration (LHAM)　Organizing Committee Member

2013 - 2018
Auto-Tuning for Multicore and GPU (ATMG)　Program Committee Member

2012 - 2018
IPSJ Special Interest Group on System Architecture Steering Committee　Steering Committee Member

2013/04 - 2017/03
IPSJ Symposium on High Performance Computing and Computational Science (HPCS) Program Committee　Member

2008/10 - 2017/03
IPSJ Tohoku Branch Steering Committee　Steering Committee Member

2014/04 - 2016/03
IPSJ Annual Meeting on Advanced Computing System and Infrastructure (ACSI) Program Committee　Member

2014/04 - 2015/03
IPSJ Transactions on Advanced Computing Systems (ACS) Editorial Committee　ACS Editorial Committee Member

2011/04 - 2015/03
IPSJ Tohoku Branch　General Affairs Secretary

2012/04 - 2014/03
IPSJ Symposium on Advanced Computing Systems and Infrastructures (SACSIS) Program Committee　Member

2012/04 - 2014/03
IPSJ Tohoku Branch　General Affairs Secretary

2012/04 - 2014/03
IPSJ Symposium on Advanced Computing Systems and Infrastructures (SACSIS) Program Committee　Member

2012/04 - 2014/03
IPSJ Tohoku Branch　Public Relations Secretary

2010/04 - 2012/03
Society of Scientific Systems　Accelerator Technology Working Group Member

2009/09 - 2012/03
IPSJ Special Interest Group on High Performance Computing Steering Committee　Steering Committee Member

2007/04 - 2011/03
IEICE Technical Committee on Computer Systems　Member

2005/04 - 2011/03
International Workshop on Automatic Performance Tuning (iWAPT)　Program chair

2025 -
HPCI Cooperative Service Organizing and Working Group　Member

2024/10 -
ICPP2021 Program Committee　Member

2021 -
ACM/IEEE Supercomputing Conference 2020 (SC20)　Technical Program Committee Member

2020/11 -
International Workshop on Large-scale HPC Application Modernization (LHAM)　Program Committee Chair

2018 -
HPC Asia 2018　Program Committee Member

2018 -
HPC Asia 2018　Poster Chair

2018 -
IPSJ Symposium on High Performance Computing and Computational Science (HPCS) Program Committee　Program Committee Chair

2016/06 -
International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies (HEART)　Program Committee Member

2015/04 -
International Workshop on Software Engineering for Parallel Systems (SEPS)　Program Committee Member

2015 -
International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies (HEART)　Program Committee Member

2015 -
2014 IEEE International Parallel & Distributed Processing Symposium (IPDPS)　Program Committee Member

2014/06 -
International Workshop on Hardware-Software Co-Design for High Performance Computing (Co-HPC)　Program Committee Member

2014 -
ACM/IEEE Supercomputing Conference 2013 (SC13)　Technical Program Committee Member

2013/11 -
Auto-Tuning for Multicore and GPU (ATMG)　Organizing Committee Chair

2013/09 -
Legacy HPC Application Migration (LHAM)　Organizing Committee Chair

2013 -
International Workshop on Automatic Performance Tuning　Organizing Committee Chair

2012 -
International Workshop on Peer-to-Peer Networking (P2PNet'10)　Program Committee Member

2010/12 -

Professional Memberships 4

The Institute of Electronics, Information, and Communication Engineers
Association for Computing Machinery (ACM)
The Institute of Electrical and Electronics Engineers (IEEE)
Information Processing Society of Japan

Research Interests 3

parallel and distributed processing
computer architecture
high-performance computing

Research Areas 5

Informatics / High-performance computing
Informatics / Intelligent informatics
Informatics / Information networks
Informatics / Computer systems
Informatics / Software

Awards 22

Best Student Oral Presentation Award

2025/08
Outstanding Effort Award

2025/08
Outstanding Student Award

2025/08　Outstanding Student Award
SCA'25 Best Paper Award

2025/02　Supercomputing Asia 2025　Improving the Efficiency of a Deep Reinforcement Learning-Based Power Management System for HPC Clusters Using Curriculum Learning
Best Undergraduate Student Award

2024/08
IEEE Computer Society Japan Chapter xSIG Young Researcher Award

2024/08　IEEE Computer Society Japan Chapter
Best paper award at the 26th Workshop on Advances in Parallel and Distributed Computational Models

2024/05　Combining lossy compression with multi-level caching for data staging over network
Outstanding Effort Award

2023/07　The 7th cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming
Best Workshop Paper Award

2020/11　International Symposium on Computing and Networking (CANDAR20)　Improving the Accuracy in SpMV Implementation Selection with Machine Learning
IEEE Computer Society Japan Chapter xSIG Young Researcher Award

2020/07　The 4th cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming
HPC IN ASIA POSTER AWARD

2020/06　ISC High Performance Computing 2020　Challenges in Solving Scheduling Problems with the D-Wave Quantum Annealer
Best Poster Award at COOL Chips 22

2019/04　IEEE Symposium on Low-Power and High-Speed Chips　An Energy Optimization Method for Hybrid In-Memory Checkpointing
Best Paper Award

2018/12　The Second International Workshop on Automation in Machine Learning and Big Data (AutoML 2018)
Best Workshop Paper Award at CANDAR'18

2018/11　International Symposium on Computing and Networking (CANDAR)
Best Workshop Paper Award at CANDAR'15

2015/12/10　International Symposium on Computing and Networking (CANDAR)
Best Poster Award at COOL Chips XV

2012/04　IEEE Symposium on Low-Power and High-Speed Chips
Best Poster Award of HiPEAC '12

2012/01
The Poster Award

2011/01　Next-generation supercomputing symposium
Ishida Foundation Award

2009/10/30　Ishida Memorial Foundation
Noguchi Award

2008/05/28　IPSJ Tohoku Branch
Funai Information Technology Encouragement Prize

2006/04/22　Funai Foundation for Information Technology
ISPA'04 Best Paper Award

2004/12/14　ISPA2004

Papers 316

Workflow Batch Job Scheduling with Considering Task Dependencies Peer-reviewed

Kaito Yanai, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

Lecture Notes in Computer Science　123-144　2026/01/02
Publisher: Springer Nature Switzerland
DOI： 10.1007/978-3-032-10507-3_7 　

ISSN： 0302-9743

eISSN： 1611-3349
CityScaleCast: Spatiotemporal GNN for City-Scale Weather Prediction with GraphCast-Guided Parallel Modeling and Multi-Step Forecasting in Sendai Peer-reviewed

Xuanwen Pan, Yoichi Shimomura, Sichen Tao, Masatoshi Kawai, Keichi Takahashi, Hiroyuki Takizawa

Workshop on Multi-scale, Multi-physics, Coupled Problems and AI enhanced simulations on HPC (MMCP'26)　2026/01
Explainable AI-Guided Genetic Algorithms for Efficient Software Automatic Tuning Peer-reviewed

Toshinobu Katayama, Masatoshi Kawai, Yoichi Shimomura, Keichi Takahashi, Hiroyuki Takizawa

Workshop on Multi-scale, Multi-physics, Coupled Problems and AI enhanced simulations on HPC (MMCP'26)　2026/01
Semantic Equivalence Verification of HPC Codes Using LLMs Peer-reviewed

Yuta Tanizawa, Masatoshi Kawai, Keichi Takahashi, Hiroyuki Takizawa

International Workshop on Foundational Large Language Models Advances for HPC in Asia　2026/01
Co-Design of a Power State-Aware Scheduler and an Intelligent Power Manager for Energy-Efficient HPC Systems Peer-reviewed

Raka Satya Prasasta, Santana Yuda Pradata, Kadek Gemilang Santiyuda, Muhammad Alfian Amrizal, Reza Pulungan, Hiroyuki Takizawa

Energy Efficient HPC State of the Practice Workshop 2026　2026/01
Deep Learning-Integrated Pairwise-Qubit Subsystems for Highly Efficient Quantum Circuit Simulation Peer-reviewed

Santana Yuda Pradata, Muhammad Alfian Amrizal, Wiwit Suryanto, Ahmad Ridwan, Tresna Nugraha, Hiroyuki Takizawa

Supercomputing Asia/HPC Asia 2026　2026/01
TRIOS: Reducing File-System Contention through Predictive Time-Resolved I/O Simulation in Job Scheduling Peer-reviewed

YuTsen Tseng, Masatoshi Kawai, Keichi Takahashi, Hiroyuki Takizawa

Supercomputing Asia/HPC Asia 2026　2026/01
Climate Change Effects on Probable Maximum Precipitation (PMP) of Mesoscale Convective Systems: Model-based Estimation and Large Ensemble-based Frequency Analysis Peer-reviewed

Yusuke Hiraga, Satoshi Watanabe, Takeshi Yamashita, Hiroyuki Takizawa

Journal of Hydrology　661　133724-133724　2025/11
Publisher: Elsevier BV
DOI： 10.1016/j.jhydrol.2025.133724 　

ISSN： 0022-1694
Developing an End-to-End 3D X-Ray Ptychography Workflow Using Surrogate Models Peer-reviewed

Ryota Koda, Keichi Takahashi, Hiroyuki Takizawa, Nozomu Ishiguro, Yukio Takahashi

Concurrency and Computation: Practice and Experience　37　(25-26)　2025/10/02
Publisher: Wiley
DOI： 10.1002/cpe.70308 　

ISSN： 1532-0626

eISSN： 1532-0634

Abstract: Recently, X‐ray ptychography has attracted significant attention as a non‐destructive imaging technique with high spatial resolution. However, its application to real‐time imaging is limited by the long execution time required for iterative phase retrieval, which reconstructs sample images from diffraction patterns. To address this issue, deep learning‐based surrogate models have been proposed to accelerate iterative phase retrieval by directly predicting sample images. While these surrogate models achieve significant speed‐ups, they typically ignore the time needed for model training and dataset preparation, which can diminish their benefits. Consequently, conventional iterative phase retrieval may outperform surrogate‐based approaches in end‐to‐end performance. This study aims to implement real‐time X‐ray ptychography using surrogate models that explicitly incorporate model training and dataset preparation into the workflow. Specifically, we propose a method that constructs a sample‐specific surrogate model on‐the‐fly using a small subset of observed diffraction patterns and uses its predictions as initial estimates for iterative phase retrieval. The proposed method is up to 2.72 times faster than conventional iterative phase retrieval, even when including training and dataset preparation times. Moreover, the proposed method ensures that the reconstructed images satisfy physical constraints. Comprehensive performance evaluations further demonstrate that the trade‐off between model accuracy and preparation time is critical for optimizing the total execution time in the X‐ray ptychography workflow.
Power absorption and temperature rise in deep learning based head models for local radiofrequency exposures Peer-reviewed

Sachiko Kodera, Reina Yoshida, Essam A Rashed, Yinliang Diao, Hiroyuki Takizawa, Akimasa Hirata

Physics in Medicine & Biology　2025/03/16

DOI： 10.1088/1361-6560/adb935 　
Improving the Efficiency of a Deep Reinforcement Learning-Based Power Management System for HPC Clusters Using Curriculum Learning Peer-reviewed

Thomas Budiarjo, Santana Yuda Pradata, Kadek Gemilang Santiyuda, Muhammad Alfian Amrizal, Reza Pulungan, Hiroyuki Takizawa

Proceedings of the 2025 Supercomputing Asia Conference　1-13　2025/03/10
Publisher: ACM
DOI： 10.1145/3718350.3718359 　
Performance evaluation of the LBM simulations in fluid dynamics on SX-Aurora TSUBASA vector engine Peer-reviewed

Xiangcheng Sun, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa, Xian Wang

Computer Physics Communications　307　109411-109411　2025/02
Publisher: Elsevier BV
DOI： 10.1016/j.cpc.2024.109411 　

ISSN： 0010-4655
Clustering Based Job Runtime Prediction for Backfilling Using Classification Peer-reviewed

Hang Cui, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

Lecture Notes in Computer Science　40-59　2024/12/21
Publisher: Springer Nature Switzerland
DOI： 10.1007/978-3-031-74430-3_3 　

ISSN： 0302-9743

eISSN： 1611-3349
Maximizing Energy Budget Utilization Using Dynamic Power Cap Control Peer-reviewed

Sho Ishii, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

Lecture Notes in Computer Science　161-180　2024/12/21
Publisher: Springer Nature Switzerland
DOI： 10.1007/978-3-031-74430-3_9 　

ISSN： 0302-9743

eISSN： 1611-3349
A Node Selection Method for on-Demand Job Execution with Considering Deadline Constraints Peer-reviewed

Daiki Nakai, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

Lecture Notes in Computer Science　141-160　2024/12/21
Publisher: Springer Nature Switzerland
DOI： 10.1007/978-3-031-74430-3_8 　

ISSN： 0302-9743

eISSN： 1611-3349
Leveraging Hardware Performance Counters for Predicting Workload Interference in Vector Supercomputers Peer-reviewed

Shubham, Keichi Takahashi, Hiroyuki Takizawa

International Conference on Parallel and Distributed Computing: Applications and Technologies (PDCAT)　2024/12

DOI： 10.48550/arXiv.2410.18126 　
DRAS-OD: A Reinforcement Learning based Job Scheduler for On-Demand Job Scheduling in High-Performance Computing Systems Peer-reviewed

Hang Cui, Keichi Takahashi, Hiroyuki Takizawa

2024 Twelfth International Symposium on Computing and Networking (CANDAR)　21-29　2024/11/26
Publisher: IEEE
DOI： 10.1109/candar64496.2024.00011 　
Real-Time Phase Retrieval Using On-the-Fly Training of Sample-Specific Surrogate Models Peer-reviewed

Ryota Koda, Keichi Takahashi, Hiroyuki Takizawa, Nozomu Ishiguro, Yukio Takahashi

2024 Twelfth International Symposium on Computing and Networking (CANDAR)　59-66　2024/11/26
Publisher: IEEE
DOI： 10.1109/candar64496.2024.00015 　
A QA-Assisted Job Scheduler for Minimizing the Impact of Urgent Computing on HPC System Operation Peer-reviewed

Tatsuyoshi Ohmura, Keichi Takahashi, Ryusuke Egawa, Hiroyuki Takizawa

2024 Twelfth International Symposium on Computing and Networking Workshops (CANDARW)　197-203　2024/11/26
Publisher: IEEE
DOI： 10.1109/candarw64572.2024.00039 　
Modernizing an Operational Real-Time Tsunami Simulator to Support Diverse Hardware Platforms Peer-reviewed

Keichi Takahashi, Takashi Abe, Akihiro Musa, Yoshihiko Sato, Yoichi Shimomura, Hiroyuki Takizawa, Shunichi Koshimura

2024 IEEE International Conference on Cluster Computing (CLUSTER)　414-425　2024/09/24
Publisher: IEEE
DOI： 10.1109/cluster59578.2024.00043 　
XAI-Based Feature Importance Analysis on Loop Optimization Peer-reviewed

Toshinobu Katayama, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)　38　782-791　2024/05/27
Publisher: IEEE
DOI： 10.1109/ipdpsw63119.2024.00142 　
Combining Lossy Compression with Multi-Level Caching for Data Staging over Network Peer-reviewed

Rei Aoyagi, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)　41　212-221　2024/05/27
Publisher: IEEE
DOI： 10.1109/ipdpsw63119.2024.00059 　
Towards sub-10 nm spatial resolution by tender X-ray ptychographic coherent diffraction imaging Peer-reviewed

Nozomu Ishiguro, Fusae Kaneko, Masaki Abe, Yuki Takayama, Junya Yoshida, Taiki Hoshino, Shuntaro Takazawa, Hideshi Uematsu, Yuhei Sasaki, Naru Okawa, Keichi Takahashi, Hiroyuki Takizawa, Hiroyuki Kishimoto, Yukio Takahashi

Applied Physics Express　17　(5)　2024/05/01

DOI： 10.35848/1882-0786/ad4846 　

ISSN： 1882-0778

eISSN： 1882-0786
AOBA: The Most Powerful Vector Supercomputer in the World Invited

Hiroyuki Takizawa, Keichi Takahashi, Yoichi Shimomura, Ryusuke Egawa, Kenji Oizumi, Satoshi Ono, Takeshi Yamashita, Atsuko Saito

Sustained Simulation Performance 2022　71-81　2024/03/15
Publisher: Springer Nature Switzerland
DOI： 10.1007/978-3-031-41073-4_6 　
Reuse distance-based shared LLC management mechanism for heterogeneous CPU-GPU systems Peer-reviewed

Jiaheng Liu, Ryusuke Egawa, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

IEICE Electronics Express　21　(4)　20230520-20230520　2024/02/25
Publisher: Institute of Electronics, Information and Communications Engineers (IEICE)
DOI： 10.1587/elex.21.20230520 　

eISSN： 1349-2543
Current Status and Future of the ABINIT-MP Program

Yuji MOCHIZUKI, Tatsuya NAKANO, Kota SAKAKURA, Hideo DOI, Koji OKUWAKI, Toshihiro KATO, Hiroyuki TAKIZAWA, Satoshi OHSHIMA, Tetsuya HOSHINO, Takahiro KATAGIRI

Journal of Computer Chemistry, Japan　2024
Publisher: Society of Computer Chemistry Japan
DOI： 10.2477/jccj.2024-0022 　

ISSN： 1347-1767

eISSN： 1347-3824
Development Status of ABINIT-MP in 2023 Peer-reviewed

Yuji MOCHIZUKI, Tatsuya NAKANO, Kota SAKAKURA, Koji OKUWAKI, Hideo DOI, Toshihiro KATO, Hiroyuki TAKIZAWA, Akira NARUSE, Satoshi OHSHIMA, Tetsuya HOSHINO, Takahiro KATAGIRI

Journal of Computer Chemistry, Japan　23　(1)　4-8　2024
Publisher: Society of Computer Chemistry Japan
DOI： 10.2477/jccj.2024-0001 　

ISSN： 1347-1767

eISSN： 1347-3824
Association of nuclear cataract prevalence with UV radiation and heat load in lens of older people -five city study- Peer-reviewed

Kotaro Kinoshita, Sachiko Kodera, Natsuko Hatsusaka, Ryusuke Egawa, Hiroyuki Takizawa, Eri Kubo, Hiroshi Sasaki, Akimasa Hirata

Environmental Science and Pollution Research　30　(59)　123832-123842　2023/11/22
Publisher: Springer Science and Business Media LLC
DOI： 10.1007/s11356-023-31079-2 　

eISSN： 1614-7499
Prototype of a Batched Quantum Circuit Simulator for the Vector Engine Peer-reviewed

Keichi Takahashi, Toshio Mori, Hiroyuki Takizawa

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis　1499-1505　2023/11/12
Publisher: ACM
DOI： 10.1145/3624062.3624226 　
Conflict-aware workload co-execution on SX-aurora TSUBASA Peer-reviewed

Riku Nunokawa, Yoichi Shimomura, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

CCF Transactions on High Performance Computing　4　(6)　425-438　2023/10/05
Publisher: Springer Science and Business Media LLC
DOI： 10.1007/s42514-023-00171-x 　

ISSN： 2524-4922

eISSN： 2524-4930

Abstract: NEC SX-Aurora TSUBASA (SX-AT) is the latest vector supercomputer, consisting of host processors called Vector Hosts (VHs) and vector processors called Vector Engines (VEs). The goal of this work is to simultaneously use both VHs and VEs to increase the resource utilization and improve the system throughput by co-executing more workloads. One difficulty is that performance interferences among VH and VE workloads could occur because they share some computing resources and potentially compete to use the same resource at the same time, so-called resource conflicts. To achieve efficient workload co-execution, first, this paper experimentally investigates the performance interference between a VH and a VE, when each of the two processors executes a different workload. It is empirically shown that the frequency of system calls from the VE workload could be a good indicator to predict if the co-execution could cause severe performance interference, even though monitoring system calls requires a huge runtime overhead and it is impractical to simply use it for decision making of co-execution. Then, this paper proposes a workload co-execution strategy based on a practical approach to identifying a pair of VE and VH workloads that could cause severe performance interferences. Our evaluation results clearly demonstrate that the system call frequency can be used to predict if the workload can affect the performance of another co-executing workload, and VH’s CPU load can be a good approximation of the system call frequency. The proposed approach based on the CPU loads could accurately identify a pair of workloads causing frequent resource conflicts, and thus reduce the risk of severe performance interferences between co-executing workloads on an SX-AT system, resulting in shorter makespan without significantly increasing the turn-around time.
Balancing exploitation and exploration in parallel Bayesian optimization under computing resource constraint Peer-reviewed

Moto Satake, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)　706-713　2023/05
Publisher: IEEE
DOI： 10.1109/ipdpsw59300.2023.00122 　
An Advantage Actor-Critic Deep Reinforcement Learning Method for Power Management in HPC Systems Peer-reviewed

Fitra Rahmani Khasyah, Kadek Gemilang Santiyuda, Gabriel Kaunang, Faizal Makhrus, Muhammad Alfian Amrizal, Hiroyuki Takizawa

Lecture Notes in Computer Science　94-107　2023/04/08
Publisher: Springer Nature Switzerland
DOI： 10.1007/978-3-031-29927-8_8 　

ISSN： 0302-9743

eISSN： 1611-3349
Equivalence Checking of Code Transformation by Numerical and Symbolic Approaches Peer-reviewed

Shunpei Sugawara, Keichi Takahashi, Yoichi Shimomura, Ryusuke Egawa, Hiroyuki Takizawa

Parallel and Distributed Computing, Applications and Technologies　373-386　2023/04/08
Publisher: Springer Nature Switzerland
DOI： 10.1007/978-3-031-29927-8_29 　

ISSN： 0302-9743

eISSN： 1611-3349
Towards Priority-Flexible Task Mapping for Heterogeneous Multi-core NUMA Systems Peer-reviewed

Yifan Jin, Mulya Agung, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

Parallel and Distributed Computing, Applications and Technologies　3-15　2023/04/08
Publisher: Springer Nature Switzerland
DOI： 10.1007/978-3-031-29927-8_1 　

ISSN： 0302-9743

eISSN： 1611-3349
A Task-Parallel Runtime for Heterogeneous Multi-node Vector Systems Peer-reviewed

Kazuki Ide, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

Parallel and Distributed Computing, Applications and Technologies　331-343　2023/04/08
Publisher: Springer Nature Switzerland
DOI： 10.1007/978-3-031-29927-8_26 　

ISSN： 0302-9743

eISSN： 1611-3349
Xevolver for Performance Tuning of C Programs Invited

Hiroyuki Takizawa, Shunpei Sugawara, Yoichi Shimomura, Keichi Takahashi, Ryusuke Egawa

Sustained Simulation Performance 2021　85-93　2023/02/18
Publisher: Springer International Publishing
DOI： 10.1007/978-3-031-18046-0_6 　
Estimation of the number of heat illness patients in eight metropolitan prefectures of Japan: Correlation with ambient temperature and computed thermophysiological responses Peer-reviewed

Akito Takada, Sachiko Kodera, Koji Suzuki, Mio Nemoto, Ryusuke Egawa, Hiroyuki Takizawa, Akimasa Hirata

Frontiers in Public Health　11　2023/02/17
Publisher: Frontiers Media SA
DOI： 10.3389/fpubh.2023.1061135 　

eISSN： 2296-2565

The number of patients with heat illness transported by ambulance has been gradually increasing due to global warming. In intense heat waves, it is crucial to accurately estimate the number of cases with heat illness for management of medical resources. Ambient temperature is an essential factor with respect to the number of patients with heat illness, although thermophysiological response is a more relevant factor with respect to causing symptoms. In this study, we computed daily maximum core temperature increase and daily total amount of sweating in a test subject using a large-scale, integrated computational method considering the time course of actual ambient conditions as input. The correlation between the number of transported people and their thermophysiological temperature is evaluated in addition to conventional ambient temperature. With the exception of one prefecture, which features a different Köppen climate classification, the number of transported people in the remaining prefectures, with a Köppen climate classification of Cfa, are well estimated using either ambient temperature or computed core temperature increase and daily amount of sweating. For estimation using ambient temperature, an additional two parameters were needed to obtain comparable accuracy. Even using ambient temperature, the number of transported people can be estimated if the parameters are carefully chosen. This finding is practically useful for the management of ambulance allocation on hot days as well as public enlightenment.
Toward Building a Digital Twin of Job Scheduling and Power Management on an HPC System Peer-reviewed

Tatsuyoshi Ohmura, Yoichi Shimomura, Ryusuke Egawa, Hiroyuki Takizawa

Lecture Notes in Computer Science　47-67　2023/01/12
Publisher: Springer Nature Switzerland
DOI： 10.1007/978-3-031-22698-4_3 　

ISSN： 0302-9743

eISSN： 1611-3349
Efficient Pause Location Prediction Using Quantum Annealing Simulations and Machine Learning Peer-reviewed

Michael R. Zielewski, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

IEEE Access　11　104285-104294　2023

DOI： 10.1109/ACCESS.2023.3317698 　
Performance Evaluation of a Next-Generation SX-Aurora TSUBASA Vector Supercomputer Peer-reviewed

Keichi Takahashi, Soya Fujimoto, Satoru Nagase, Yoko Isobe, Yoichi Shimomura, Ryusuke Egawa, Hiroyuki Takizawa

ISC High Performance　359-378　2023

DOI： 10.1007/978-3-031-32041-5_19 　
A Real-time Flood Inundation Prediction on SX-Aurora TSUBASA Peer-reviewed

Yoichi Shimomura, Akihiro Musa, Yoshihiko Sato, Atsuhiko Konja, Guoqing Cui, Rei Aoyagi, Keichi Takahashi, Hiroyuki Takizawa

2022 IEEE 29th International Conference on High Performance Computing, Data, and Analytics (HiPC)　27　192-197　2022/12
Publisher: IEEE
DOI： 10.1109/hipc56025.2022.00035 　
mdx: A Cloud Platform for Supporting Data Science and Cross-Disciplinary Research Collaborations Peer-reviewed

Toyotaro Suzumura, Akiyoshi Sugiki, Hiroyuki Takizawa, Akira Imakura, Hiroshi Nakamura, Kenjiro Taura, Tomohiro Kudoh, Toshihiro Hanawa, Yuji Sekiya, Hiroki Kobayashi, Yohei Kuga, Ryo Nakamura, Renhe Jiang, Junya Kawase, Masatoshi Hanai, Hiroshi Miyazaki, Tsutomu Ishizaki, Daisuke Shimotoku, Daisuke Miyamoto, Kento Aida, Atsuko Takefusa, Takashi Kurimoto, Koji Sasayama, Naoya Kitagawa, Ikki Fujiwara, Yusuke Tanimura, Takayuki Aoki, Toshio Endo, Satoshi Ohshima, Keiichiro Fukazawa, Susumu Date, Toshihiro Uchibayashi

2022 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech)　2022/09/12
Publisher: IEEE
DOI： 10.1109/dasc/picom/cbdcom/cy55231.2022.9927975 　
A SYCL-based high-level programming framework for HPC programmers to use remote FPGA clusters Peer-reviewed

Satoshi Kaneko, Hiroyuki Takizawa, Kentaro Sano

International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies　92-94　2022/06/09
Publisher: ACM
DOI： 10.1145/3535044.3535058 　
A Conflict-Aware Capacity Control Mechanism for Deep Cache Hierarchy Peer-reviewed

Jiaheng LIU, Ryusuke EGAWA, Hiroyuki TAKIZAWA

IEICE Transactions on Information and Systems　E105.D　(6)　1150-1163　2022/06/01
Publisher: Institute of Electronics, Information and Communications Engineers (IEICE)
DOI： 10.1587/transinf.2021edp7201 　

ISSN： 0916-8532

eISSN： 1745-1361
Towards Conflict-Aware Workload Co-execution on SX-Aurora TSUBASA Peer-reviewed

Riku Nunokawa, Yoichi Shimomura, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

Lecture Notes in Computer Science　163-174　2022/03/16
Publisher: Springer International Publishing
DOI： 10.1007/978-3-030-96772-7_16 　

ISSN： 0302-9743

eISSN： 1611-3349
Evaluating the Performance and Conformance of a SYCL Implementation for SX-Aurora TSUBASA Peer-reviewed

Jiahao Li, Mulya Agung, Hiroyuki Takizawa

Lecture Notes in Computer Science　36-47　2022/03/16
Publisher: Springer International Publishing
DOI： 10.1007/978-3-030-96772-7_4 　

ISSN： 0302-9743

eISSN： 1611-3349
A Method for Reducing Time-to-Solution in Quantum Annealing Through Pausing Peer-reviewed

Michael Ryan Zielewski, Hiroyuki Takizawa

International Conference on High Performance Computing in Asia-Pacific Region　7　137-145　2022/01/07
Publisher: ACM
DOI： 10.1145/3492805.3492815 　
A Cost Model for Compilers Based on Transfer Learning Peer-reviewed

Yuta Sasaki, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

IPDPS Workshops　942-951　2022

DOI： 10.1109/IPDPSW55747.2022.00152 　
Automated selection of build configuration based on machine learning Peer-reviewed

Reo Furuhata, Minglu Zhao, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

IPDPS Workshops　934-941　2022

DOI： 10.1109/IPDPSW55747.2022.00151 　
Spatiotemporal Anomaly Detection for Large-Scale Sensor Data Peer-reviewed

Minglu Zhao, Hiroyuki Takizawa, Tomoya Soma

2021 12th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP)　2021/12/10
Publisher: IEEE
DOI： 10.1109/paap54281.2021.9720310 　
Portability of Vectorization-aware Performance Tuning Expertise across System Generations Peer-reviewed

Shunpei Sugawara, Yoichi Shimomura, Ryusuke Egawa, Hiroyuki Takizawa

2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)　30　242-248　2021/12
Publisher: IEEE
DOI： 10.1109/mcsoc51149.2021.00043 　
A memory bank conflict prevention mechanism for SYCL on SX-Aurora TSUBASA Peer-reviewed

Wenbin Wang, Jiahao Li, Yohichi Shimomura, Hiroyuki Takizawa

2021 Ninth International Symposium on Computing and Networking Workshops (CANDARW)　2　217-222　2021/11
Publisher: IEEE
DOI： 10.1109/candarw53999.2021.00043 　
Evaluating I/O Acceleration Mechanisms of SX-Aurora TSUBASA Peer-reviewed

Yuta Sasaki, Ayumu Ishizuka, Mulya Agung, Hiroyuki Takizawa

2021 IEEE International Parallel & Distributed Processing Symposium Workshops　2021/05
OpenCL-like offloading with metaprogramming for SX-Aurora TSUBASA Peer-reviewed

Hiroyuki Takizawa, Shinji Shiotsuki, Naoki Ebata, Ryusuke Egawa

Parallel Computing　102　102754-102754　2021/05
Publisher: Elsevier BV
DOI： 10.1016/j.parco.2021.102754 　

ISSN： 0167-8191
Evaluation of flood damage reduction throughout Japan from adaptation measures taken under a range of emissions mitigation scenarios Peer-reviewed

Tao Yamamoto, So Kazama, Yoshiya Touge, Hayata Yanagihara, Tsuyoshi Tada, Takeshi Yamashita, Hiroyuki Takizawa

Climatic Change　165　(60)　2021/04
Publisher: Springer Science and Business Media LLC
DOI： 10.1007/s10584-021-03081-5 　

ISSN： 0165-0009

eISSN： 1573-1480
Preemptive Parallel Job Scheduling for Heterogeneous Systems Supporting Urgent Computing Peer-reviewed

Mulya Agung, Yuta Watanabe, Henning Weber, Ryusuke Egawa, Hiroyuki Takizawa

IEEE Access　9　17557-17571　2021
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
DOI： 10.1109/ACCESS.2021.3053162 　

eISSN： 2169-3536
neoSYCL: a SYCL implementation for SX-Aurora TSUBASA Peer-reviewed

Yinan Ke, Mulya Agung, Hiroyuki Takizawa

International Conference on High Performance Computing in Asia-Pacific Region　2021/01
Improving Quantum Annealing Performance on Embedded Problems Invited Peer-reviewed

Zielewski, M.R., Agung, M., Egawa, R., Takizawa, H.

Supercomputing Frontiers and Innovations　7　(4)　2020/12

DOI： 10.14529/js?200403 　

ISSN： 2313-8734 2409-6008
Failure Prediction in Datacenters Using Unsupervised Multimodal Anomaly Detection Peer-reviewed

Minglu Zhao, Reo Furuhata, Mulya Agung, Hiroyuki Takizawa, Tomoya Soma

The IEEE BigData 2020, the third international conference on the Internet of Things Data Analytics (IoTDA)　2020/12
A Conflict-Aware Capacity Control Mechanism for Last-Level Cache Peer-reviewed

Jiaheng Liu, Ryusuke Egawa, Mulya Agung, Hiroyuki Takizawa

Proceedings - 2020 8th International Symposium on Computing and Networking Workshops, CANDARW 2020　416-420　2020/11/01
Publisher: Institute of Electrical and Electronics Engineers Inc.
DOI： 10.1109/CANDARW51189.2020.00085 　
Exploiting the Potentials of the Second Generation SX-Aurora TSUBASA Peer-reviewed

Ryusuke Egawa, Souya Fujimoto, Tsuyoshi Yamashita, Daisuke Sasaki, Yoko Isobe, Yoichi Shimomura, Hiroyuki Takizawa

The 11th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS’20)　2020/11
Improving the accuracy in SpMV implementation selection with machine learning Peer-reviewed

Reo Furuhata, Minglu Zhao, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

The Eighth International Conference on Computing and Networking Workshops (CANDARW)　2020/11
Polymorphic Data Layout for SX-Aurora TSUBASA Vector Engines Peer-reviewed

Naoki Ebata, Yoko Isobe, Ryusuke Egawa, Hiroyuki Takizawa

The Eighth International Conference on Computing and Networking (CANDAR)　2020/11
Automatic Load-Balance Tuning of a Flood Simulation Code Using Bayesian Optimization Peer-reviewed

Ayumu Ishiduka, Tsuyoshi Yamashita, Ryusuke Egawa, Hiroyuki Takizawa, Tao Yamamoto, So Kazama

The 4th cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming (xSIG2020)　2020/07
Quantum Compiler : Automatic Vectorization Assisted by Quantum Annealer Peer-reviewed

Yuta Sasaki, Michael Ryan Zielewski, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

The ISC High Performance 2020 (poster)　2020/06
Challenges in Solving Scheduling Problems with the D-Wave Quantum Annealer Peer-reviewed

Michael Ryan Zielewski, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

The ISC High Performance 2020 (poster)　2020/06
Automatically Avoiding Memory Access Conflicts on SX-Aurora TSUBASA Peer-reviewed

Naoki Ebata, Ryusuke Egawa, Yoko Isobe, Ryoji Takaki, Hiroyuki Takizawa

2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)　2020/05
Publisher: IEEE
DOI： 10.1109/ipdpsw50202.2020.00139 　
Task Priority Control for the HPX Runtime System Peer-reviewed

Suhang Jiang, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)　2020/05
Publisher: IEEE
DOI： 10.1109/ipdpsw50202.2020.00137 　
Comparison of Direct and Indirect Networks for High-Performance FPGA Clusters Peer-reviewed

Antoniette Mondigo, Tomohiro Ueno, Kentaro Sano, Hiroyuki Takizawa

Applied Reconfigurable Computing. Architectures, Tools, and Applications　314-329　2020/04
Publisher: Springer International Publishing
DOI： 10.1007/978-3-030-44534-8_24 　

ISSN： 0302-9743

eISSN： 1611-3349
Xevolver: A code transformation framework for separation of system-awareness from application codes Peer-reviewed

Kazuhiko Komatsu, Ayumu Gomi, Ryusuke Egawa, Daisuke Takahashi, Reiji Suda, Hiroyuki Takizawa

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE　32　(7)　2020/04

DOI： 10.1002/cpe.5577 　

ISSN： 1532-0626

eISSN： 1532-0634
Online MPI process mapping for coordinating locality and memory congestion on NUMA systems Peer-reviewed

Agung, M., Amrizal, M.A., Egawa, R., Takizawa, H.

Supercomputing Frontiers and Innovations　7　(1)　71-90　2020/03
Publisher: FSAEIHE South Ural State University (National Research University)
DOI： 10.14529/jsfi200104 　

ISSN： 2313-8734 2409-6008
Exafsa: Parallel fluid-structure-acoustic simulation

Florian Lindner, Amin Totounferoush, Miriam Mehl, Benjamin Uekermann, Neda Ebrahimi Pour, Verena Krupp, Sabine Roller, Thorsten Reimann, Dörte C. Sternel, Ryusuke Egawa, Hiroyuki Takizawa, Frédéric Simonis

Lecture Notes in Computational Science and Engineering　136　271-300　2020
Publisher: Springer
DOI： 10.1007/978-3-030-47956-5_10 　

ISSN： 2197-7100 1439-7358
Preliminary Evaluation towards Task Priority Control in HPX Peer-reviewed

Suhang Jiang, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2020) (poster)　2020/01
Acceleration of Hyper-Parameter Auto-Tuning with Parallelization and Time Constraints Peer-reviewed

Chaoyi Zhang, Ryusuke Egawa, Hiroyuki Takizawa

International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2020) (poster)　2020/01
An Optimization Technology of Software Auto-Tuning Applied to Machine Learning Software Peer-reviewed

Toshiki Tabeta, Naoto Seki, Akihiro Fujii, Teruo Tanaka, Hiroyuki Takizawa

International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2020) (poster)　2020/01
DeLoc: A Locality and Memory-Congestion-Aware Task Mapping Method for Modern NUMA Systems Peer-reviewed

Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

IEEE Access　8　6937-6953　2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
DOI： 10.1109/ACCESS.2019.2963726 　
An OpenCL-like Offload Programming Framework for SX-Aurora TSUBASA Peer-reviewed

Hiroyuki Takizawa, Shinji Shiotsuki, Naoki Ebata, Ryusuke Egawa

The 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT 2019)　285-291　2019/12
Peachy Parallel Assignments (EduHPC 2019)

Mulya Agung, Allen Malony, Hiroyuki Takizawa, David P. Bunde, Muhammad A. Amrizal, Steven Bogaerts, Ryusuke Egawa, Daniel A. Ellsworth, Jorge Fernandez-Fabeiro, Arturo Gonzalez-Escribano, Sukhamay Kundu, Alina Lazar

Proceedings of EduHPC 2019: Workshop on Education for High Performance Computing - Held in conjunction with SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis　75-83　2019/11/01
Publisher: Institute of Electrical and Electronics Engineers Inc.
DOI： 10.1109/EduHPC49559.2019.00015 　
An Automatic MPI Process Mapping Method Considering Locality and Memory Congestion on NUMA Systems Peer-reviewed

Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)　17-24　2019/09
Optimization of a gas-particle flow solver on vector supercomputers Peer-reviewed

Yoichi Shimomura, Midori Kano, Takashi Soga, Kenta Yamaguchi, Akihiro Musa, Yusuke Mizuno, Shun Takahashi, Ryusuke Egawa, Hiroyuki Takizawa

The 31st International Conference on Parallel Computational Fluid Dynamics (ParCFD’2019)　1-4　2019/06
Memory First : A Performance Tuning Strategy Focusing on Memory Access Patterns Peer-reviewed

Naoki Ebata, Ryusuke Egawa, Yoko Isobe, Ryoji Takaki, Hiroyuki Takizawa

The ISC High Performance conference 2019 (poster)　2019/06
Scaling performance for n-body stream computation with a ring of FPGAs Peer-reviewed

Jens Huthmann, Abiko Shin, Artur Podobas, Kentaro Sano, Hiroyuki Takizawa

The International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies (HEART2019)　1-6　2019/06
Scalability Analysis of Deeply Pipelined Tsunami Simulation with Multiple FPGAs Peer-reviewed

Antoniette Mondigo, Tomohiro Ueno, Kentaro Sano, Hiroyuki Takizawa

IEICE Transactions on Information and Systems　E102-D　(5)　1029-1036　2019/05
Publisher: Institute of Electronics, Information and Communications Engineers (IEICE)
DOI： 10.1587/transinf.2018RCP0007 　

ISSN： 0916-8532

eISSN： 1745-1361
An Energy Optimization Method for Hybrid In-Memory Checkpointing Peer-reviewed

Muhammad Alfian Amrizal, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

2019 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS)(poster)　2019/04
The Impacts of Locality and Memory Congestion-aware Thread Mapping on Energy Consumption of Modern NUMA Systems Peer-reviewed

Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

2019 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS)　2019/04
Performance Evaluation of Different Implementation Schemes of an Iterative Flow Solver on Modern Vector Machines Peer-reviewed

Kenta Yamaguchi, Takashi Soga, Yoichi Shimomura, Thorsten Reimann, Kazuhiko Komatsu, Ryusuke Egawa, Akihiro Musa, Hiroyuki Takizawa, Hiroaki Kobayashi

Supercomputing Frontiers and Innovations　6　(1)　36-47　2019/03

DOI： 10.14529/jsfi190106 　
Xevolver: A user-defined code transformation approach to streamlining legacy code migration

Hiroyuki Takizawa, Reiji Suda, Daisuke Takahashi, Ryusuke Egawa

Advanced Software Technologies for Post-Peta Scale Computing: The Japanese Post-Peta CREST Research Project　163-181　2018/12/06
Publisher: Springer Singapore
DOI： 10.1007/978-981-13-1924-2_9 　
Enhancing memory bandwidth in a single stream computation with multiple FPGAs Peer-reviewed

Antoniette Mondigo, Kentaro Sano, Hiroyuki Takizawa

The 2018 International Conference on Field-Programmable Technology (FPT’18)　2018/12
Automatic hyperparameter tuning of machine learning models under time constraints Peer-reviewed

Zhen Wang, Mulya Agung, Ryusuke Egawa, Reiji Suda, Hiroyuki Takizawa

IEEE Big Data 2018 Workshop　2018/12
A Locality and Memory Congestion-aware Thread Mapping Method for Modern NUMA Systems Peer-reviewed

Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

ACM/IEEE Supercomputing Conference 2018 (SC18) (poster)　2018/11
Preconditioner auto-tuning with deep learning for sparse iterative algorithms Peer-reviewed

Kenya Yamada, Takahiro Katagiri, Hiroyuki Takizawa, Kazuo Minami, Mitsuo Yokokawa, Toru Nagai, Masao Ogino

The Sixth International Symposium on Computing and Networking Workshops (CANDARW 2018), LHAM workshop　2018/11
Investigating the Effects of Dynamic Thread Team Size Adjustment for Irregular Applications Peer-reviewed

Xiong Xiao, Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

The Sixth International Symposium on Computing and Networking (CANDAR 2018)　2018/11
A Failure Prediction-based Adaptive Checkpointing Method with Less Reliance on Temperature Monitoring for HPC Applications Peer-reviewed

Muhammad Alfian Amrizal, Pei Li, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

2018 IEEE International Conference on Cluster Computing, FTS workshop　483-491　2018/09
A machine learning-based approach for selecting SpMV kernels and matrix storage formats Peer-reviewed

Cui, H., Hirasawa, S., Kobayashi, H., Takizawa, H.

IEICE Transactions on Information and Systems　E101D　(9)　2307-2314　2018/09

DOI： 10.1587/transinf.2017EDP7176 　

ISSN： 1745-1361 0916-8532
Expressing the Differences in Code Optimizations between Intel Knights Landing and NEC SX-ACE Processors

Hiroyuki Takizawa, Thorsten Reimann, Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Akihiro Musa, Hiroaki Kobayashi

The 13th World Congress on Computational Mechanics/2nd Pan American Congress on Computational Mechanics　2018/07
Performance Estimation of Deeply Pipelined Fluid Simulation on Multiple FPGAs with High-speed Communication Subsystem Peer-reviewed

Antoniette Mondigo, Kentaro Sano, Hiroyuki Takizawa

The 29th Annual IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2018)　10-12　2018/07
MIGRATING AN OLD VECTOR CODE TO MODERN VECTOR MACHINES Peer-reviewed

Hiroyuki Takizawa, Kenta Yamaguchi, Takashi Soga, Thorsten Reimann, Kazuhiko Komatsu, Ryusuke Egawa, Akihiro Musa, Hiroaki Kobayashi

30th International Conference on Parallel Computational Fluid Dynamics　2018/04
Optimization of a Polydisperse Multiphase Flow Simulation Code with Reactions and Phase Changes

佐々木大輔, 加藤季広, 磯部洋子, 笠原弘貴, 渡部広吾輝, 志村啓, 奥野航平, 松尾亜紀子, 江川隆輔, 滝沢寛之, 小林広明

SENAC : 東北大学大型計算機センター広報　51　(1)　47-51　2018/01
Publisher: 東北大学サイバーサイエンスセンター
ISSN： 0286-7419

Bulletin
Use of Code Structural Features for Machine Learning to Predict Effective Optimizations. Peer-reviewed

Yuki Kawarabatake, Mulya Agung, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa

33rd IEEE International Parallel & Distributed Processing Symposium Workshops(IPDPSW), International Workshop on Automatic Performance Tuning　1049-1055　2018
Publisher: IEEE Computer Society
DOI： 10.1109/IPDPSW.2018.00163 　
Energy-Performance Modeling of Speculative Checkpointing for Exascale Systems Peer-reviewed

Muhammad Alfian Amrizal, Atsuya Uno, Yukinori Sato, Hiroyuki Takizawa, Hiroaki Kobayashi

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS　E100D　(12)　2749-2760　2017/12

DOI： 10.1587/transinf.2017PAP0002 　

ISSN： 1745-1361
Optimizing Energy Consumption on HPC Systems with a Multi-Level Checkpointing Mechanism Peer-reviewed

Muhammad Alfian Amrizal, Hiroyuki Takizawa

2017 IEEE International Conference on Networking, Architecture, and Storage, NAS 2017 - Proceedings　2017/09/06
Publisher: Institute of Electrical and Electronics Engineers Inc.
DOI： 10.1109/NAS.2017.8026868 　
Potential of a modern vector supercomputer for practical applications: performance evaluation of SX-ACE Peer-reviewed

Ryusuke Egawa, Kazuhiko Komatsu, Shintaro Momose, Yoko Isobe, Akihiro Musa, Hiroyuki Takizawa, Hiroaki Kobayashi

JOURNAL OF SUPERCOMPUTING　73　(9)　3948-3976　2017/09

DOI： 10.1007/s11227-017-1993-y 　

ISSN： 0920-8542

eISSN： 1573-0484
A customizable auto-tuning scenario with user-defined code transformations Peer-reviewed

Hiroyuki Takizawa, Daichi Sato, Shoichi Hirasawa, Daisuke Takahashi

Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017　1372-1378　2017/06/30
Publisher: Institute of Electrical and Electronics Engineers Inc.
DOI： 10.1109/IPDPSW.2017.79 　
Potential of Code Optimization Using Machine Learning

滝沢寛之, 崔航, 平澤将一

計算工学講演会論文集　22　2017/06
Automatic Generation of Code Transformation Rules for Data Layout Optimization

Takeshi Yamada, Shoichi Hirasawa, Reiji Suda, Hiroyuki Takizawa

IPSJ SIG Technical Reports (HPC)　2017-HPC-158　(28)　1-8　2017/03
A Study on Auto-Tuning Using Scenario Templates

Daichi Sato, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IPSJ National Convention　2017　(1)　45-46　2017/03
Toward Dynamic Load Balancing across OpenMP Thread Teams for Irregular Workloads Peer-reviewed

Xiong Xiao, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

International Journal of Networking and Computing　7　(2)　387-404　2017
Publisher: IJNC Editorial Committee
DOI： 10.15803/ijnc.7.2_387 　

ISSN： 2185-2839

In high performance computing, massively parallel many-core processors such as Intel Xeon Phi coprocessors are becoming popular because they can significantly accelerate various applications. To parallelize applications efficiently for such many-core processors, several high-level programming models have been proposed. OpenMP is the de facto standard programming model for shared-memory parallel processing. For hierarchical parallel processing, OpenMP version 4.0 or later allows programmers to create multiple thread teams, each of which contains a set of newly created threads that can synchronize with one another. When multiple thread teams execute an application, dynamic load balancing across the teams is important, because static load balancing easily leads to load imbalance across teams and thus degrades performance. In this paper, we first motivate our work by clarifying the benefit of using multiple thread teams to execute an irregular workload on a many-core processor. We then demonstrate that dynamic load balancing across those thread teams can significantly improve the performance of irregular workloads on a many-core processor, even when the scheduling overhead is taken into account. Although the current OpenMP specification does not provide such a dynamic load balancing mechanism, we discuss its benefits through experiments on the Intel Xeon Phi coprocessor. We evaluate the performance gain of dynamic load balancing across thread teams using a ray tracing code. The results show that such a mechanism can improve performance by up to 14% compared to static load balancing across teams, with the scheduling overhead taken into account.
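The contrast between static and dynamic load balancing across teams described in this abstract can be illustrated with a small model. The sketch below is not the paper's implementation and does not use OpenMP; it only simulates, under simplifying assumptions, how long a set of irregular tasks takes when tasks are split into fixed contiguous blocks per team versus handed out one at a time to whichever team becomes free first. The function names, the `overhead` parameter, and the sample costs are all hypothetical.

```python
import heapq
import math


def static_makespan(costs, teams):
    """Finish time when tasks are split into contiguous, equal-sized blocks."""
    if not costs:
        return 0
    chunk = math.ceil(len(costs) / teams)
    # Each team receives one fixed block of tasks, regardless of task cost.
    loads = [sum(costs[i:i + chunk]) for i in range(0, len(costs), chunk)]
    return max(loads)


def dynamic_makespan(costs, teams, overhead=0.0):
    """Finish time when the next task always goes to the earliest-free team.

    `overhead` is a hypothetical per-task scheduling cost added on dispatch.
    """
    finish = [0.0] * teams  # current finish time of every team (a min-heap)
    heapq.heapify(finish)
    for cost in costs:
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + overhead + cost)
    return max(finish)


# Irregular workload: expensive tasks are clustered at the front.
SAMPLE_COSTS = [8, 7, 6, 5, 1, 1, 1, 1]
```

With the sample workload and two teams, the static split gives one team all the expensive tasks, while the dynamic scheme spreads them out. Raising `overhead` shows how the scheduling cost eats into that advantage.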
A Directive Generation Approach to High Code-Maintainability for Various HPC Systems. Peer-reviewed

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Int. J. Netw. Comput.　7　(2)　405-418　2017
Vectorization-aware Loop Optimization with User-defined Code Transformations Peer-reviewed

Hiroyuki Takizawa, Thorsten Reimann, Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Akihiro Musa, Hiroaki Kobayashi

2017 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER)　685-692　2017

DOI： 10.1109/CLUSTER.2017.102 　

ISSN： 1552-5244
Performance and Power Analysis of SX-ACE using HP-X Benchmark Programs Peer-reviewed

Ryusuke Egawa, Kazuhiko Komatsu, Hiroyuki Takizawa, Akihiro Musa, Hiroaki Kobayashi, Yoko Isobe, Toshihiro Kato, Souya Fujimoto

2017 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER)　693-700　2017

DOI： 10.1109/CLUSTER.2017.65 　

ISSN： 1552-5244
An Application-Level Incremental Checkpointing Mechanism with Automatic Parameter Tuning Peer-reviewed

Hiroyuki Takizawa, Muhammad Alfian Amrizal, Kazuhiko Komatsu, Ryusuke Egawa

2017 FIFTH INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR)　389-394　2017

DOI： 10.1109/CANDAR.2017.96 　

ISSN： 2379-1888
Designing an Open Database of System-aware Code Optimizations Peer-reviewed

Ryusuke Egawa, Kazuhiko Komatsu, Hiroyuki Takizawa

2017 FIFTH INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR)　369-374　2017

DOI： 10.1109/CANDAR.2017.102 　

ISSN： 2379-1888
A Memory Congestion-aware MPI Process Placement for Modern NUMA Systems Peer-reviewed

Mulya Agung, Muhammad Alfian Amrizal, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa

2017 IEEE 24TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC)　152-161　2017

DOI： 10.1109/HiPC.2017.00026 　

ISSN： 1094-7256
Directive Translation for Various HPC Systems Using the Xevolver Framework Invited

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Sustained Simulation Performance 2016　109-117　2016/12

DOI： 10.1007/978-3-319-46735-1_9 　
Making a Legacy Code Auto-tunable without Messing It Up Peer-reviewed

Hiroyuki Takizawa, Daichi Sato, Shoichi Hirasawa, Hiroaki Kobayashi

ACM/IEEE Supercomputing Conference 2016 (SC16)　2016/11
A Power-Performance Tradeoff of HBM by Limiting Access Channels Peer-reviewed

Takuya Toyoshima, Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Proceedings of IEEE Symposium on Low-Power and High-Speed Chips　2016/04
A Bypass Mechanism for Application-Adaptive Cache Resizing Peer-reviewed

Masayuki Sato, Takumi Takai, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

電子情報通信学会論文誌　J99-D　(3)　2016/03
A Study on Code Transformation Using Machine Learning

川原畑勇希, 平澤将一, 滝沢寛之, 小林広明

電気関係学会東北支部連合大会講演論文集　2016　227-227　2016
Publisher: 電気関係学会東北支部連合大会実行委員会
DOI： 10.11528/tsjc.2016.0_227 　
Automatic Parameter Tuning of Stencil Computation Using Directives Peer-reviewed

Takuya Tsunogawa, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IPSJ Transactions on Advanced Computing Systems　2016
A Cache Partitioning Mechanism to Protect Shared Data for CMPs Peer-reviewed

Masayuki Sato, Shin Nishimura, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2016 IEEE SYMPOSIUM IN LOW-POWER AND HIGH-SPEED CHIPS (COOL CHIPS XIX)　2016

DOI： 10.1109/CoolChips.2016.7503674 　

ISSN： 2473-4683
Translation of Large-Scale Simulation Codes for an OpenACC Platform Using the Xevolver Framework. Peer-reviewed

Kazuhiko Komatsu, Ryusuke Egawa, Shoichi Hirasawa, Hiroyuki Takizawa, Ken'ichi Itakura, Hiroaki Kobayashi

Int. J. Netw. Comput.　6　(2)　167-180　2016
A Code Selection Mechanism Using Deep Learning Peer-reviewed

Hang Cui, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2016 IEEE 10TH INTERNATIONAL SYMPOSIUM ON EMBEDDED MULTICORE/MANY-CORE SYSTEMS-ON-CHIP (MCSOC)　385-392　2016

DOI： 10.1109/MCSoC.2016.46 　
The Importance of Dynamic Load Balancing among OpenMP Thread Teams for Irregular Workloads Peer-reviewed

Xiong Xiao, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2016 FOURTH INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR)　529-535　2016

DOI： 10.1109/CANDAR.2016.48 　

ISSN： 2379-1888
A Directive Generation Approach Using User-defined Rules Peer-reviewed

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2016 FOURTH INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR)　515-521　2016

DOI： 10.1109/CANDAR.2016.94 　

ISSN： 2379-1888
A User-Defined Code Transformation Approach to Overlapping MPI Communication with Computation Peer-reviewed

Yasuharu Hayashi, Hiroyuki Takizawa, Hiroaki Kobayashi

2016 FOURTH INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR)　508-514　2016

DOI： 10.1109/CANDAR.2016.35 　

ISSN： 2379-1888
Xevdriver: A software system supporting XML-based source-to-source code transformations on Fortran programs Peer-reviewed

Reiji Suda, Hiroyuki Takizawa

2016 FOURTH INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR)　522-528　2016

DOI： 10.1109/CANDAR.2016.113 　

ISSN： 2379-1888
Performance Evaluation of Compiler-Assisted OpenMP Codes on Various HPC Systems Invited

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Sustained Simulation Performance 2015　147-157　2015/12

DOI： 10.1007/978-3-319-20340-9_12 　
A Light-Weight Rollback Mechanism for Testing Kernel Variants in Auto-Tuning Peer-reviewed

Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS　E98D　(12)　2178-2186　2015/12

DOI： 10.1587/transinf.2015PAP0028 　

ISSN： 1745-1361
An approach to the highest efficiency of the HPCG benchmark on the SX-ACE supercomputer Peer-reviewed

Kazuhiko Komatsu, Ryusuke Egawa, Yoko Isobe, Ryusei Ogata, Hiroyuki Takizawa, Hiroaki Kobayashi

ACM/IEEE Supercomputing Conference 2015 (SC15)　1-2　2015/11
Expressing system-awareness as code transformations for performance portability across diverse HPC systems Peer-reviewed

Hiroyuki Takizawa, Shoichi Hirasawa, Kazuhiko Komatsu, Ryusuke Egawa, Hiroaki Kobayashi

International Workshop on Portability Among HPC Architectures for Scientific Applications 2015　1-6　2015/11
FLEXII: A Flexible Insertion Policy for Dynamic Cache Resizing Mechanisms Peer-reviewed

Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IEICE TRANSACTIONS ON ELECTRONICS　E98C　(7)　550-558　2015/07

DOI： 10.1587/transele.E98.C.550 　

ISSN： 1745-1353
Achieving Both Performance and Maintainability of Real Applications with Xevolver

平澤将一, 滝沢寛之, 小林広明

計算工学講演会論文集　20　4p　2015/06
Performance Evaluation of an OpenMP Parallelization by Using Automatic Parallelization Information

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Sustained Simulation Performance 2014　119-126　2015
Publisher: Springer International Publishing
DOI： 10.1007/978-3-319-10626-7_10 　
A Data Management Policy for Energy-Efficient Cache Mechanisms

Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Sustained Simulation Performance 2015　61-75　2015

DOI： 10.1007/978-3-319-20340-9_6 　
Automatic Parameter Tuning of Hierarchical Incremental Checkpointing Peer-reviewed

Alfian Amrizal, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

HIGH PERFORMANCE COMPUTING FOR COMPUTATIONAL SCIENCE - VECPAR 2014　8969　298-309　2015

DOI： 10.1007/978-3-319-17353-5_25 　

ISSN： 0302-9743
Optimized Data Transfers Based on the OpenCL Event Management Mechanism Peer-reviewed

Hiroyuki Takizawa, Shoichi Hirasawa, Makoto Sugawara, Isaac Gelado, Hiroaki Kobayashi, Wen-mei W. Hwu

SCIENTIFIC PROGRAMMING　2015　(576498)　1-16　2015

DOI： 10.1155/2015/576498 　

ISSN： 1058-9244

eISSN： 1875-919X
Combining code refactoring and auto-tuning to improve performance portability of high-performance computing applications Peer-reviewed

Chunyan Wang, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

The Sixth International Conference on Computational Logics, Algebras, Programming, Tools, and Benchmarking (COMPUTATION TOOLS 2015)　20-26　2015
Identification and elimination of platform-specific code smells in high performance computing applications Peer-reviewed

Chunyan Wang, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

International Journal of Networking and Computing　5　(1)　180-199　2015
Publisher: IJNC Editorial Committee
DOI： 10.15803/ijnc.5.1_180 　

ISSN： 2185-2839

A code smell is a code pattern that might indicate a code or design problem that makes the application code hard to evolve and maintain. Automatic detection of code smells has been studied to help users find which parts of their application codes should be refactored. However, code smells have not been defined in a formal manner. Moreover, existing detection tools are designed mainly for object-oriented applications and are rarely available for high performance computing (HPC) applications. HPC applications are usually optimized for a particular platform to achieve high performance, and hence have special code smells called platform-specific code smells (PSCSs). The purpose of this work is to develop a code smell alert system that helps users find PSCSs in HPC applications and thereby improve performance portability across different platforms. This paper presents a PSCS alert system that is based on an abstract syntax tree (AST) and XML. Code patterns of PSCSs are defined in a formal way using the AST information represented in XML, and XML Path Language (XPath) is used to describe those patterns. A database is built to store the transformation recipes, written as XSLT files, for eliminating detected PSCSs. The recall and precision results obtained with real applications show that the proposed system can detect potential PSCSs accurately. The evaluation of the performance portability of real applications demonstrates that eliminating PSCSs leads to significant performance changes, and therefore the code portions with detected PSCSs have to be refactored to improve performance portability across multiple platforms.
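The idea of describing a code pattern as an XPath query over an XML-encoded AST can be shown with a minimal sketch. This is not the system presented in the paper: the XML element and attribute names below (`function`, `pragma`, `vendor`) are hypothetical and follow no real tool's schema, and the pattern uses only the XPath subset supported by Python's standard `xml.etree.ElementTree`. The XSLT-based elimination step is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML rendering of an AST; names are illustrative only.
AST_XML = """<program>
  <function name="update">
    <pragma vendor="vendorA" text="novector"/>
    <loop var="i"/>
  </function>
  <function name="reduce">
    <loop var="j"/>
  </function>
</program>"""

# Assumed smell pattern: a vendor-specific pragma directly inside a function.
PSCS_PATTERN = "./pragma[@vendor]"


def find_smells(xml_text, pattern=PSCS_PATTERN):
    """Return (function name, vendor, directive text) for every match."""
    root = ET.fromstring(xml_text)
    hits = []
    for func in root.iter("function"):
        # Evaluate the pattern relative to each function node.
        for node in func.findall(pattern):
            hits.append((func.get("name"), node.get("vendor"), node.get("text")))
    return hits
```

Keeping the pattern as a plain string separate from the matching code means new smell definitions could be added as data, without touching the detector itself.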
Auto-Tuning Using Xevolver

平澤将一, 肖熊, 滝沢寛之, 小林広明

計算工学会学会誌「計算工学」　20　(2)　14-17　2015
An Energy-Efficient Dynamic Memory Address Mapping Mechanism Peer-reviewed

Masayuki Sato, Chengguang Han, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2015 IEEE SYMPOSIUM ON LOW-POWER AND HIGH-SPEED CHIPS　1-3　2015

DOI： 10.1109/CoolChips.2015.7158660 　
A Verification Framework for Streamlining Empirical Auto-tuning Peer-reviewed

Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

PROCEEDINGS OF 2015 THIRD INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR)　508-514　2015

DOI： 10.1109/CANDAR.2015.115 　

ISSN： 2379-1888
Migration of an Atmospheric Simulation Code to an OpenACC Platform Using the Xevolver Framework Peer-reviewed

Kazuhiko Komatsu, Ryusuke Egawa, Shoichi Hirasawa, Hiroyuki Takizawa, Ken'ichi Itakura, Hiroaki Kobayashi

PROCEEDINGS OF 2015 THIRD INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR)　515-520　2015

DOI： 10.1109/CANDAR.2015.102 　

ISSN： 2379-1888
Xevtgen: Fortran code transformer generator for high performance scientific codes Peer-reviewed

Reiji Suda, Hiroyuki Takizawa, Shoichi Hirasawa

PROCEEDINGS OF 2015 THIRD INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR)　528-534　2015

DOI： 10.1109/CANDAR.2015.63 　

ISSN： 2379-1888
A Case Study of User-Defined Code Transformations for Data Layout Optimizations Peer-reviewed

Takeshi Yamada, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

PROCEEDINGS OF 2015 THIRD INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR)　535-541　2015

DOI： 10.1109/CANDAR.2015.96 　

ISSN： 2379-1888
MVP-Cache: A Multi-Banked Cache Memory for Energy-Efficient Vector Processing of Multimedia Applications Peer-reviewed

Ye Gao, Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS　E97D　(11)　2835-2843　2014/11

DOI： 10.1587/transinf.2014EDP7227 　

ISSN： 1745-1361
Early evaluation of the SX-ACE processor Peer-reviewed

Ryusuke Egawa, Shintaro Momose, Kazuhiko Komatsu, Yoko Isobe, Hiroyuki Takizawa, Akihiro Musa, Hiroaki Kobayashi

ACM/IEEE Supercomputing Conference 2014 (SC14)　1-2　2014/11
A Study on Reducing the Power Consumption of Vector Media Processors

宇野渉, 高也, 佐藤雅之, 江川隆輔, 滝沢寛之, 小林広明

電気関係学会東北支部連合大会予稿集　2014/08
A Study on Managing Inter-Thread Shared Data in Cache Memory

西村秦, 佐藤雅之, 江川隆輔, 滝沢寛之, 小林広明

電気関係学会東北支部連合大会予稿集　2014/08
Exploring system architectures for next-generation CFD simulations in the postpeta-scale era Peer-reviewed

KOMATSU Kazuhiko, EGAWA Ryusuke, TAKIZAWA Hiroyuki, SOGA Takashi, MUSA Akihiro, KOBAYASHI Hiroaki

Journal of Fluid Science and Technology　9　(5)　JFST0073-JFST0073　2014
Publisher: The Japan Society of Mechanical Engineers
DOI： 10.1299/jfst.2014jfst0073 　

ISSN： 1880-5558

CFD simulations with uniform grids have attracted attention as next-generation CFD simulations on large-scale supercomputing systems. The Building-Cube Method (BCM) is one such next-generation CFD method. Its basic idea is to balance the computational load among the processing elements of a supercomputing system by dividing the whole calculation into many parallel tasks with the same amount of computation. Thus, it is suitable for highly parallel computation on supercomputing systems. This paper first implements BCM on five supercomputing systems as an example of a next-generation CFD simulation in the upcoming postpeta-scale era. Then, through theoretical analyses and performance evaluations, this paper clarifies the requirements that a next-generation CFD simulation places on future supercomputing systems. The performance evaluations show that as the number of processing elements increases, the imbalance of data exchanges among nodes becomes more serious than that of calculations, even in a next-generation CFD simulation. While the calculation time can ideally be reduced in proportion to the number of processing elements, the data transfer time becomes dominant in the total execution time. Unlike in a massively parallel system architecture, the number of nodes in a system should therefore be as small as possible to reduce data transfers. The performance analyses also show that the memory bandwidth limits the performance of BCM and that use of an on-chip memory is effective in improving the performance. A memory subsystem that achieves a higher sustained memory bandwidth is required. Therefore, a supercomputing system that consists of a small number of high-performance nodes is essential to achieve high sustained performance of next-generation CFD in the upcoming postpeta-scale era by reducing the data transfers, which eventually become a bottleneck in large-scale simulation.
A Platform-Specific Code Smell Alert System for High Performance Computing Applications Peer-reviewed

Chunyan Wang, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

PROCEEDINGS OF 2014 IEEE INTERNATIONAL PARALLEL & DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW)　653-662　2014

DOI： 10.1109/IPDPSW.2014.76 　
On-chip checkpointing with 3D-stacked memories Peer-reviewed

Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2014 International 3D Systems Integration Conference, 3DIC 2014 - Proceedings　1-6　2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
DOI： 10.1109/3DIC.2014.7152173 　
An Energy Optimization Method for Vector Processing Mechanisms Peer-reviewed

Ye Gao, Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2014 IEEE COOL CHIPS XVII　1-3　2014

DOI： 10.1109/CoolChips.2014.6842957 　

ISSN： 2473-4683
A compiler-assisted OpenMP migration method based on automatic parallelizing information Peer-reviewed

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)　8488　450-459　2014
Publisher: Springer Verlag
DOI： 10.1007/978-3-319-07518-1_30 　

ISSN： 1611-3349 0302-9743
An Approach to Customization of Compiler Directives for Application-Specific Code Transformations Peer-reviewed

Xiong Xiao, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2014 IEEE 8TH INTERNATIONAL SYMPOSIUM ON EMBEDDED MULTICORE/MANYCORE SOCS (MCSOC)　99-106　2014

DOI： 10.1109/MCSoC.2014.23 　
Xevolver: An XML-based Code Translation Framework for Supporting HPC Application Migration Peer-reviewed

Hiroyuki Takizawa, Shoichi Hirasawa, Yasuharu Hayashi, Ryusuke Egawa, Hiroaki Kobayashi

2014 21ST INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC)　1-11　2014

DOI： 10.1109/HiPC.2014.7116902 　

ISSN： 1094-7256
Xevolver: an XML-based programming framework for software evolution Peer-reviewed

Hiroyuki Takizawa, Shoichi Hirasawa, Hiroaki Kobayashi

ACM/IEEE Supercomputing Conference 2013 (SC13)　1-2　2013/11
An Automatic Performance Tracking System for Software Evolution Peer-reviewed

Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IPSJ Transactions on Advanced Computing Systems　2013/10
A Capacity-Aware Thread Scheduling Method Combined with Cache Partitioning to Reduce Inter-Thread Cache Conflicts Peer-reviewed

Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS　E96D　(9)　2047-2054　2013/09

DOI： 10.1587/transinf.E96.D.2047 　

ISSN： 1745-1361
ブロックバイパス機構によるキャッシュのエネルギ効率化に関する研究 [A Study on Improving Cache Energy Efficiency with a Block Bypass Mechanism]

高井拓実, 佐藤雅之, 江川隆輔, 滝沢寛之, 小林広明

並列/分散/協調処理に関する「北九州」サマー・ワークショップ (SWoPP2013)　1-9　2013/07
Performance Evaluation of a Next-Generation CFD on Various Supercomputing Systems

Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Sustained Simulation Performance 2012　123-132　2013
Publisher: Springer Berlin Heidelberg
DOI： 10.1007/978-3-642-32454-3_11 　
Analysing the performance improvements of optimizations on modern HPC systems Peer-reviewed

Kazuhiko Komatsu, Toshihide Sasaki, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Sustained Simulation Performance 2013 - Proceedings of the Joint Workshop on Sustained Simulation Performance　13-25　2013
Publisher: Springer Science and Business Media, LLC
DOI： 10.1007/978-3-319-01439-5-2 　
HPC refactoring with hierarchical abstractions to help software evolution Peer-reviewed

Hiroyuki Takizawa, Ryusuke Egawa, Daisuke Takahashi, Reiji Suda

Sustained Simulation Performance 2012 - Proceedings of the Joint Workshop on High Performance Computing on Vector Systems, and Workshop on Sustained Simulation Performance　27-33　2013
Publisher: Springer Science and Business Media, LLC
DOI： 10.1007/978-3-642-32454-3-3 　
Performance evaluation of phase-based correspondence matching on GPUs Peer-reviewed

Mamoru Miura, Kinya Fudano, Koichi Ito, Takafumi Aoki, Hiroyuki Takizawa, Hiroaki Kobayashi

APPLICATIONS OF DIGITAL IMAGE PROCESSING XXXVI　8856　2013

DOI： 10.1117/12.2023550 　

ISSN： 0277-786X

eISSN： 1996-756X
Checkpoint-Restart for Heterogeneous Computing Systems Invited

滝沢寛之, 佐藤雅之, 江川隆輔, 小林広明

Reliability Engineering Association of Japan　35　(8)　515　2013

DOI： 10.11348/reajshinrai.35.8_515 　
A Flexible Insertion Policy for Dynamic Cache Resizing Mechanisms Peer-reviewed

Masayuki Sato, Yusuke Tobo, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2013 IEEE COOL CHIPS XVI (COOL CHIPS)　1-3　2013

DOI： 10.1109/CoolChips.2013.6547923 　

ISSN： 2473-4683
ClMPI: An opencl extension for interoperation with the message passing interface Peer-reviewed

Hiroyuki Takizawa, Makoto Sugawara, Shoichi Hirasawa, Isaac Gelado, Hiroaki Kobayashi, Wen-Mei W. Hwu

Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013　1138-1148　2013
Publisher: IEEE Computer Society
DOI： 10.1109/IPDPSW.2013.183 　
A comparison of performance tunabilities between OpenCL and OpenACC Peer-reviewed

Makoto Sugawara, Shoichi Hirasawa, Kazuhiko Komatsu, Hiroyuki Takizawa, Hiroaki Kobayashi

Proceedings - IEEE 7th International Symposium on Embedded Multicore/Manycore System-on-Chip, MCSoC 2013　147-152　2013
Publisher: IEEE Computer Society
DOI： 10.1109/MCSoC.2013.31 　
Design and Evaluation of a Media-oriented Vector Processor with a Multi-banked Cache Memory Peer-reviewed

Ye Gao, Naoki Shoji, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2013 IEEE 11TH SYMPOSIUM ON EMBEDDED SYSTEMS FOR REAL-TIME MULTIMEDIA (ESTIMEDIA)　78-87　2013

DOI： 10.1109/ESTIMedia.2013.6704506 　

ISSN： 2325-1271
Performance evaluation of BCM on various supercomputing systems Peer-reviewed

Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

The 24th International Conference on Parallel Computational Fluid Dynamics　1-2　2012/11
Performance Evaluation of BCM on Various Supercomputing Systems Peer-reviewed

Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

In 24th International Conference on Parallel Computational Fluid Dynamics　2012/05/21
ウェイ適応型キャッシュの高エネルギ効率化のためのデッドブロック早期追い出しポリシ [An Early Dead-Block Eviction Policy for Improving the Energy Efficiency of Way-Adaptable Caches] Peer-reviewed

東方雄亮, 佐藤雅之, 江川隆輔, 滝沢寛之, 小林広明

先進的計算基盤シンポジウムSACSIS2012　4-5　2012/05
A Self-Organizing Overlay Network Mechanism Spreading Meta-Information of Resources Based on Users' Locality of Interests for Efficient Resource Discovery Peer-reviewed

Tsutomu Inaba, Yoshitomo Murata, Hiroyuki Takizawa, Hiroaki Kobayashi

IEICE trans. info. & syst.　J95-D　(5)　1110-1122　2012/05/01
Publisher: The Institute of Electronics, Information and Communication Engineers
ISSN： 1880-4535
A bypass mechanism for way-adaptable caches Peer-reviewed

Takumi Takai, Yusuke Tobo, Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IEEE COOL Chips XV　2012/04
Performance and scalability analysis of a chip multi vector processor Peer-reviewed

Yoshiei Sato, Akihiro Musa, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

High Performance Computing on Vector Systems 2011　3-20　2012
Publisher: Springer Science and Business Media, LLC
DOI： 10.1007/978-3-642-22244-3-1 　
A prototype implementation of OpenCL for SX vector systems Peer-reviewed

Hiroyuki Takizawa, Ryusuke Egawa, Hiroaki Kobayashi

High Performance Computing on Vector Systems 2011　41-50　2012
Publisher: Springer Science and Business Media, LLC
DOI： 10.1007/978-3-642-22244-3-3 　
Exploring Design Space of a 3D Stacked Vector Cache Peer-reviewed

Ryusuke Egawa, Yusuke Endo, Jubee Tada, Hiroyuki Takizawa, Hiroaki Kobayashi

2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC)　1477-1477　2012
A capacity-efficient insertion policy for dynamic cache resizing mechanisms Peer-reviewed

Masayuki Sato, Yusuke Tobo, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

CF '12 - Proceedings of the ACM Computing Frontiers Conference　265-267　2012

DOI： 10.1145/2212908.2212949 　
A media-oriented vector architectural extension with a high bandwidth cache system Peer-reviewed

Ye Gao, Naoki Shoji, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Symposium on Low-Power and High-Speed Chips - Proceedings for 2012 IEEE COOL Chips XV　1-3　2012

DOI： 10.1109/COOLChips.2012.6216588 　
An out-of-order vector processing mechanism for multimedia applications Peer-reviewed

Ye Gao, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

CF '12 - Proceedings of the ACM Computing Frontiers Conference　233-235　2012

DOI： 10.1145/2212908.2212941 　
GPU IMPLEMENTATION OF PHASE-BASED STEREO CORRESPONDENCE AND ITS APPLICATION Peer-reviewed

Mamoru Miura, Kinya Fudano, Koichi Ito, Takafumi Aoki, Hiroyuki Takizawa, Hiroaki Kobayashi

2012 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2012)　1697-1700　2012

DOI： 10.1109/ICIP.2012.6467205 　

ISSN： 1522-4880
Improving the Scalability of Transparent Checkpointing for GPU Computing Systems Peer-reviewed

Alfian Amrizal, Shoichi Hirasawa, Kazuhiko Komatsu, Hiroyuki Takizawa, Hiroaki Kobayashi

TENCON 2012 - 2012 IEEE REGION 10 CONFERENCE: SUSTAINABLE DEVELOPMENT THROUGH HUMANITARIAN TECHNOLOGY　1-6　2012

DOI： 10.1109/TENCON.2012.6412343 　

ISSN： 2159-3442
Exploring Design Space of a 3D Stacked Vector Cache Peer-reviewed

Ryusuke Egawa, Yusuke Endo, Hiroyuki Takizawa, Hiroaki Kobayashi, Jubee Tada

2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC)　1475-+　2012
A Network Clustering Algorithm for Sybil-Attack Resisting Peer-reviewed

Ling Xu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS　E94D　(12)　2345-2352　2011/12

DOI： 10.1587/transinf.E94.D.2345 　

ISSN： 0916-8532

eISSN： 1745-1361
Performance of building cube method on various platforms Peer-reviewed

Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

The 8th International Conference on Flow Dynamics 2011 (ICFD2011)　2011/11
An automatic task assignment method for heterogeneous computing systems Peer-reviewed

Katsuto Sato, Kazuhiko Komatsu, Hiroyuki Takizawa, Hiroaki Kobayashi

The 8th International Conference on Flow Dynamics 2011 (ICFD2011)　2011/11
Job Scheduling with Migration for Heterogeneous Computing Systems Peer-reviewed

Kentaro Koyama, Katsuto Sato, Kazuhiko Komatsu, Yoshitomo Murata, Hiroyuki Takizawa, Hiroaki Kobayashi

IPSJ Transactions on Advanced Computing Systems　4　(4)　203-213　2011/10/05
ISSN： 1882-7829
A patch-based bit mask filtering method for micropolygon rasterization Peer-reviewed

Jiali Yao, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

High-Performance Graphics (HPG2011)　2011/08
Performance of SOR methods on modern vector and scalar processors Peer-reviewed

Takashi Soga, Akihiro Musa, Koki Okabe, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

COMPUTERS & FLUIDS　45　(1)　215-221　2011/06

DOI： 10.1016/j.compfluid.2010.12.024 　

ISSN： 0045-7930
Parallel processing of the Building-Cube Method on a GPU platform Peer-reviewed

Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

COMPUTERS & FLUIDS　45　(1)　122-128　2011/06

DOI： 10.1016/j.compfluid.2010.12.019 　

ISSN： 0045-7930
A Performance Tuning Strategy Based on the Roofline Model for Vector Processors Peer-reviewed

Yoshiei Sato, Ryuichi Nagaoka, Akihiro Musa, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

IPSJ Transactions on Advanced Computing Systems　4　(3)　77-87　2011/05/12

ISSN： 1882-7772
ウェイ適応型キャッシュのための低消費エネルギ指向挿入ポリシ [A Low-Energy-Oriented Insertion Policy for Way-Adaptable Caches] Peer-reviewed

東方雄亮, 佐藤雅之, 江川隆輔, 滝沢寛之, 小林広明

先進的計算基盤シンポジウムSACSIS2011　2011　213-214　2011/05
Power-aware insertion policy for the way-adaptable caches Peer-reviewed

Yusuke Tobo, Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IEEE COOL Chips XIV　2011/04
Energy Consumption of a Chip Multi-Vector Processor Using Real Applications

永岡龍一, 佐藤義永, 撫佐昭裕, 江川隆輔, 滝沢寛之, 小林広明

情報処理学会研究報告(CD-ROM)　2010　(5)　ROMBUNNO.ARC-192,NO.3　2011/02/15

ISSN： 2186-2583
A High-performance Volunteer Computing Environment with a Dynamic Load-balancing Mechanism Peer-reviewed

Yoshitomo Murata, Yuki Ishimori, Hiroyuki Takizawa, Hiroaki Kobayashi

Transactions of Information Processing Society of Japan　52　(2)　401-414　2011/02/15
ISSN： 1882-7837
Performance Evaluation of Real-Time Stereo Correspondence on GPU

Tohoku-Section Joint Convention Record of Institutes of Electrical and Information Engineers, Japan　2011　31-31　2011
Publisher: Organizing Committee of Tohoku-Section Joint Convention of Institutes of Electrical and Information Engineers, Japan
DOI： 10.11528/tsjc.2011.0_31 　
A Self-Organized Overlay Network Management Mechanism for Heterogeneous Environments

Inaba Tsutomu, Takizawa Hiroyuki, Kobayashi Hiroaki

Information and Media Technologies　6　(2)　546-559　2011
Publisher: Information and Media Technologies Editorial Board
DOI： 10.11185/imt.6.546 　

Cloud computing and NGN technologies are now driving a paradigm shift in which various services are provided to business users over the network. In conjunction with this movement, many studies are under way to realize a ubiquitous computing environment in which a huge number of individual users can share their computing resources on the Internet, such as personal computers (PCs), game consoles, and sensors. To realize an effective resource discovery mechanism for such an environment, this paper presents an adaptive overlay network that enables a self-organizing resource management system to adapt efficiently to a heterogeneous environment. The proposed mechanism is composed of two functions. One adjusts the number of logical links of a resource, which forward search queries, so that less-useful query flooding can be reduced. The other connects resources so as to decrease the communication latency on the physical network rather than the number of query hops on the overlay network. To further improve the discovery efficiency, this paper integrates these functions into a self-organizing resource management system, SORMS, which was proposed in our previous work. The simulation results indicate that the proposed mechanism can increase the number of discovered resources by 60% without decreasing the discovery efficiency, and can reduce the total communication traffic by 80% compared with the original SORMS. This performance improvement is obtained by efficient control of logical links in a large-scale network.
NVCR: A transparent checkpoint-restart library for NVIDIA CUDA Peer-reviewed

Akira Nukada, Hiroyuki Takizawa, Satoshi Matsuoka

IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum　104-113　2011

DOI： 10.1109/IPDPS.2011.131 　
Power-aware dynamic cache partitioning for CMPs Peer-reviewed

Isao Kotera, Kenta Abe, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)　6590　135-153　2011

DOI： 10.1007/978-3-642-19448-1_8 　

ISSN： 0302-9743 1611-3349
OpenCLにおけるタスク並列化支援のための実行時依存関係解析手法 [A Runtime Dependency Analysis Method for Supporting Task Parallelization in OpenCL] Peer-reviewed

佐藤功人, 小松一彦, 滝沢寛之, 小林広明

情報処理学会論文誌コンピューティングシステム(ACS)　5　(1)　53-67　2011/01
A Runtime Dependency Analysis Method for Task Parallelization of OpenCL Programs Peer-reviewed

Katsuto Sato, Kazuhiko Komatsu, Hiroyuki Takizawa, Hiroaki Kobayashi

IPSJ Transactions on Advanced Computing Systems　4　(5)　2011
A self-organized overlay network management mechanism for heterogeneous environments Peer-reviewed

Tsutomu Inaba, Hiroyuki Takizawa, Hiroaki Kobayashi

Journal of Information Processing　19　(0)　25-38　2011
Publisher: Information Processing Society of Japan
DOI： 10.2197/ipsjjip.19.25 　

ISSN： 1882-6652 0387-5806
A history-based performance prediction model with profile data classification for automatic task allocation in heterogeneous computing systems Peer-reviewed

Katsuto Sato, Kazuhiko Komatsu, Hiroyuki Takizawa, Hiroaki Kobayashi

Proceedings - 9th IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2011　135-142　2011

DOI： 10.1109/ISPA.2011.36 　
Effects of 3-D stacked vector cache on energy consumption Peer-reviewed

Ryusuke Egawa, Yusuke Funaya, Ryuichi Nagaoka, Yusuke Endo, Akihiro Musa, Hiroyuki Takizawa, Hiroaki Kobayashi

2011 IEEE International 3D Systems Integration Conference, 3DIC 2011　2011

DOI： 10.1109/3DIC.2012.6263026 　
CheCL: Transparent checkpointing and process migration of OpenCL applications Peer-reviewed

Hiroyuki Takizawa, Kentaro Koyama, Katsuto Sato, Kazuhiko Komatsu, Hiroaki Kobayashi

Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011　864-876　2011

DOI： 10.1109/IPDPS.2011.85 　
A performance tuning strategy under combining loop transforms for a vector processor with an on-chip cache Peer-reviewed

Yoshiei Sato, Ryuichi Nagaoka, Akihiro Musa, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

ACM/IEEE Supercomputing Conference (SC10)　2010/11
Evaluating Performance and Portability of OpenCL Programs Peer-reviewed

Kazuhiko Komatsu, Katsuto Sato, Yusuke Arai, Kentaro Koyama, Hiroyuki Takizawa, Hiroaki Kobayashi

Proceedings of the 5th international Workshop on Automatic Performance Tuning　1-15　2010/06
Automatic tuning of CUDA execution parameters for stencil processing Peer-reviewed

Katsuto Sato, Hiroyuki Takizawa, Kazuhiko Komatsu, Hiroaki Kobayashi

Software Automatic Tuning: From Concepts to State-of-the-Art Results　209-228　2010
Publisher: Springer New York
DOI： 10.1007/978-1-4419-6935-4_13 　
Lessons Learned from 1-Year Experience with SX-9 and Toward the Next Generation Vector Computing Peer-reviewed

Hiroaki Kobayashi, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Akihiko Musa, Takashi Soga, Yoko Isobe

HIGH PERFORMANCE COMPUTING ON VECTOR SYSTEMS 2009　3-+　2010

DOI： 10.1007/978-3-642-03913-3_1 　
Cache partitioning strategies for 3-D stacked vector processors Peer-reviewed

Yusuke Funaya, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IEEE 3D System Integration Conference 2010, 3DIC 2010　1-6　2010

DOI： 10.1109/3DIC.2010.5751453 　
A Load-Forwarding Mechanism for the Vector Architecture in Multimedia Applications Peer-reviewed

Ye Gao, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

13TH EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN: ARCHITECTURES, METHODS AND TOOLS　412-415　2010

DOI： 10.1109/DSD.2010.93 　
Efficient data management for the building cube method using cartesian meshes on the GPU platform Peer-reviewed

Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

International Supercomputing Conference (ISC10)　2010
A majority-based control scheme for way-adaptable caches Peer-reviewed

Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)　6310　16-28　2010

DOI： 10.1007/978-3-642-16233-6_5 　

ISSN： 0302-9743 1611-3349
Resisting sybil attack by social network and network clustering Peer-reviewed

Ling Xu, Satayapiwat Chainan, Hiroyuki Takizawa, Hiroaki Kobayashi

Proceedings - 2010 10th Annual International Symposium on Applications and the Internet, SAINT 2010　15-21　2010

DOI： 10.1109/SAINT.2010.32 　
A Voting-Based Working Set Assessment Scheme for Dynamic Cache Resizing Mechanisms Peer-reviewed

Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2010 IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN　98-105　2010

DOI： 10.1109/ICCD.2010.5647599 　

ISSN： 1063-6404
Design and early evaluation of a 3-D die stacked chip multi-vector processor Peer-reviewed

Ryusuke Egawa, Yusuke Funaya, Ryu-Ichi Nagaoka, Akihiro Musa, Hiroyuki Takizawa, Hiroaki Kobayashi

IEEE 3D System Integration Conference 2010, 3DIC 2010　1-8　2010

DOI： 10.1109/3DIC.2010.5751448 　
Performance of hemisphere algorithm for fast form factor calculation Peer-reviewed

Noboru Yamada, Tomoaki Shinoda, Hiroyuki Takizawa

Heat Transfer - Asian Research　38　(7)　450-463　2009/11

DOI： 10.1002/htj.20259 　

ISSN： 1099-2871 1523-1496
Performance Optimization Techniques for Vector Processors with Cache Memory

佐藤義永, 永岡龍一, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

情報処理学会研究報告(CD-ROM)　2009　(3)　ROMBUNNO.ARC-184,6　2009/10/15

ISSN： 2186-2583
Working Sets based Thread Scheduling with Cache Partitioning Peer-reviewed

Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Poster Abstracts of The Eighteenth International Conference on Parallel Architecture and Compilation Techniques　12　2009/09
ワーキングセット評価に基づくスレッドスケジューリング [Thread Scheduling Based on Working Set Assessment]

佐藤雅之, 小寺功, 江川隆輔, 滝沢寛之, 小林広明

並列/分散/協調処理に関する「仙台」サマー・ワークショップ (SWoPP仙台2009)　1-10　2009/08
Early evaluation of a memory-stacked vector processor Peer-reviewed

Yusuke Funaya, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IEEE COOL Chips XII　165　2009/04
A Cache-Aware Thread Scheduling Policy for Multi-Core Processors Peer-reviewed

Masayuki Sato, Isao Kotera, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

The IASTED International Conference on Parallel and Distributed Computing and Networks　2009/02
実アプリケーションによるSX‐9の性能評価 [Performance Evaluation of SX-9 Using Real Applications]

曽我隆, 下村陽一, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

情報処理学会シンポジウム論文集　2009　(2)　57-64　2009/01/15

ISSN： 1344-0640
Evaluating Computational Performance of Backpropagation Learning on Graphics Hardware Peer-reviewed

Hiroyuki Takizawa, Tatsuya Chida, Hiroaki Kobayashi

Electronic Notes in Theoretical Computer Science　225　(C)　379-389　2009/01/02

DOI： 10.1016/j.entcs.2008.12.087 　

ISSN： 1571-0661
3D On-Chip Memory for the Vector Architecture Peer-reviewed

Yusuke Funaya, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2009 IEEE INTERNATIONAL CONFERENCE ON 3D SYSTEMS INTEGRATION　352-357　2009

ISSN： 2164-0157
Performance of Hemisphere Algorithm for Fast Form Factor Calculation Peer-reviewed

Noboru YAMADA, Tomoaki SHINODA, Hiroyuki TAKIZAWA

Transactions of the Japan Society of Mechanical Engineers B　075　(749)　132-139　2009/01
Publisher: The Japan Society of Mechanical Engineers
DOI： 10.1299/kikaib.75.749_132 　

ISSN： 0387-5016

Developing fast and accurate algorithms for radiative heat transfer simulation is important for efficient thermal design and simulation in diverse engineering areas. This paper describes the performance of the Hemisphere algorithm, which was originally developed for fast form factor calculation in the field of photorealistic three-dimensional computer graphics. We compared the performance of the Hemisphere algorithm with two conventional methods that are frequently used in radiative heat transfer simulation. The results show that the Hemisphere algorithm is significantly faster than the conventional methods if an absolute error of 1.0×10^-5 is acceptable. In addition, the results indicate that the Hemisphere algorithm may be suitable for the trial-and-error process of large-scale model simulation because of its tolerable form factor distribution.
Characteristics of an On-Chip Cache on NEC SX Vector Architecture Peer-reviewed

Akihiro Musa, Yoshiei Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Interdisciplinary Information Sciences　15　(1)　51-66　2009/01
Publisher: Graduate School of Information Sciences, Tohoku University
DOI： 10.4036/iis.2009.51 　

ISSN： 1340-9050

Thanks to their high effective memory bandwidth, vector systems can achieve high computational efficiency for computation-intensive scientific applications. However, they have been encountering the memory wall problem, and the effective memory bandwidth rate has decreased: the bytes per flop rate of recent vector systems has dropped from 4 (SX-7 and SX-8) to 2 (SX-8R) and 2.5 (SX-9). The situation will get worse as more functional units and/or cores are brought onto a single chip, because the pin bandwidth is limited and does not scale. To solve the problem, we propose an on-chip cache, called a vector cache, to maintain the effective memory bandwidth rate of future vector supercomputers. The vector cache employs a bypass mechanism between the main memory and register files under software control. We evaluate the performance of the vector cache on the NEC SX vector processor architecture with bytes per flop rates of 2 B/FLOP and 1 B/FLOP to clarify the basic characteristics of the vector cache. For the evaluation, we use the NEC SX-7 simulator extended with the vector cache mechanism. The benchmark programs for performance evaluation are two DAXPY-like loops and five leading scientific applications. The results indicate that the vector cache boosts the computational efficiencies of the 2 B/FLOP and 1 B/FLOP systems up to the level of the 4 B/FLOP system. In particular, when cache hit rates exceed 50%, the 2 B/FLOP system can achieve performance comparable to that of the 4 B/FLOP system. The vector cache with the bypass mechanism can provide data from both the main memory and the cache simultaneously. In addition, from the viewpoint of cache design, we investigate the impact of cache associativity on the cache hit rate, and the relationship between cache latency and performance.
The results also suggest that the associativity hardly affects the cache hit rate, and that the effects of the cache latency depend on the vector loop length of the applications. A shorter cache latency contributes to the performance improvement of applications with shorter loop lengths, even in the case of the 4 B/FLOP system. For longer loop lengths of 256 or more, the latency can effectively be hidden, and the performance is not sensitive to the cache latency. Finally, we discuss the effects of selective caching using the bypass mechanism and of loop unrolling on the vector cache performance for the scientific applications. Selective caching is effective for making efficient use of the limited cache capacity. Loop unrolling is also effective for improving performance, resulting in a synergistic effect with caching. However, there are exceptional cases in which loop unrolling worsens the cache hit rate because of an increase in the working space needed to process the unrolled loops in the cache. In such cases, the increase in the cache miss rate cancels the gain obtained by unrolling.
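The bandwidth argument in the abstract above can be illustrated with a back-of-the-envelope model. This is only a sketch under an assumption that is not taken from the paper: a fraction `hit_rate` of the requested bytes is served by the vector cache at no cost in main-memory bandwidth, so the memory only has to supply the misses. The function name and its parameters are hypothetical.

```python
def effective_bytes_per_flop(memory_bf: float, hit_rate: float) -> float:
    """Operand supply rate (B/FLOP) seen by the vector units.

    Assumption (not from the paper): cache hits consume no main-memory
    bandwidth, so memory only supplies the (1 - hit_rate) miss fraction.
    """
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in [0, 1)")
    return memory_bf / (1.0 - hit_rate)


if __name__ == "__main__":
    # Compare the 2 B/FLOP and 1 B/FLOP configurations named in the abstract.
    for memory_bf in (2.0, 1.0):
        for hit_rate in (0.0, 0.5, 0.75):
            eff = effective_bytes_per_flop(memory_bf, hit_rate)
            print(f"memory {memory_bf} B/FLOP, hit rate {hit_rate:.2f}"
                  f" -> effective {eff:.1f} B/FLOP")
```

Under this simplified model a 2 B/FLOP system with a 50% hit rate reaches the 4 B/FLOP level, which is in line with the threshold reported in the abstract. Real behavior also depends on cache latency and vector loop length, as the abstract notes.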
Performance tuning and analysis of future vector processors based on the roofline model Peer-reviewed

Yoshiei Sato, Ryuichi Nagaoka, Akihiro Musa, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

ACM International Conference Proceeding Series　7-14　2009

DOI： 10.1145/1621960.1621962 　
CheCUDA: A Checkpoint/Restart Tool for CUDA Applications Peer-reviewed

Hiroyuki Takizawa, Katsuto Sato, Kazuhiko Komatsu, Hiroaki Kobayashi

2009 INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES (PDCAT 2009)　408-+　2009

DOI： 10.1109/PDCAT.2009.78 　
Performance Evaluation of NEC SX-9 using Real Science and Engineering Applications Peer-reviewed

Takashi Soga, Akihiro Musa, Youichi Shimomura, Ken'ichi Itakura, Koki Okabe, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

PROCEEDINGS OF THE CONFERENCE ON HIGH PERFORMANCE COMPUTING NETWORKING, STORAGE AND ANALYSIS　2009

DOI： 10.1145/1654059.1654088 　
Auction-based Resource Allocation for Activating Incentives in Resource Trading in Grid Computing Peer-reviewed

Chainan Satayapiwat, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Proceedings of The 2008 IEEE International Symposium on Parallel and Distributed Processing with Applications　252-260　2008/12
Caching on a chip multi vector processor Peer-reviewed

Akihiro Musa, Yoshiei Sato, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

ACM/IEEE Supercomputing Conference (SC08)　2008/11
SPRAT: A Stream Programming Language with Runtime Auto-Tuning Peer-reviewed

Hiroyuki Takizawa, Hiroki Shiratori, Katuto Sato, Hiroaki Kobayashi

IPSJ Transactions on Advanced Computing System　1　(2)　207-220　2008/08
A Reliability Model for Result Checking in Volunteer Computing Peer-reviewed

Ling Xu, Hiroyuki Takizawa, Hiroaki Kobayashi

SAINT2008　201-204　2008/07

DOI： 10.1109/SAINT.2008.25 　
A Fast Ray Frustum-Triangle Intersection Algorithm with Precomputation and Early Termination Peer-reviewed

Kazuhiro Komatsu, Yoshiyuki Kaeriyama, Kenichi Suzuki, Hiroyuki Takizawa, Hiroaki Kobayashi

IPSJ Transactions on Advanced Computing System　1　(1)　85-95　2008/04
A Distributed and Cooperative Load Balancing Method for Large-Scale Computing Environments Peer-reviewed

Yoshitomo Murata, Tsutomu Inaba, Hiroyuki Takizawa, Hiroaki Kobayashi

IPSJ Journal　49　(3)　1214-1228　2008/03
A Parallel Image Generation Algorithm based on Photon Map Partitioning Peer-reviewed

Masahide Tamura, Hiroyuki Takizawa, Hiroaki Kobayashi

Proceedings of the 10th IASTED International Conference on Computer Graphics and Imaging (CGIM 2008)　145-151　2008/02
An Efficient Intersection Algorithm Design of Ray Tracing For Many-Core Graphics Processors Peer-reviewed

Kazuhiko Komatsu, Yoshiyuki Kaeriyama, Kenichi Suzuki, Hiroyuki Takizawa, Hiroaki Kobayashi

Proceedings of the 10th IASTED International Conference on Computer Graphics and Imaging (CGIM 2008)　165-171　2008/02
First Experiences with NEC SX-9.

Hiroaki Kobayashi, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Akihiko Musa, Takashi Soga, Yoichi Shimomura

High Performance Computing on Vector Systems　3-11　2008
Publisher: Springer
DOI： 10.1007/978-3-540-85869-0_1 　
Modeling of cache access behavior based on Zipf's law Peer-reviewed

Isao Kotera, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT　310　9-15　2008

DOI： 10.1145/1509084.1509086 　

ISSN： 1089-795X
A Fast Ray Frustum-Triangle Intersection Algorithm with Precomputation and Early Termination Peer-reviewed

Komatsu Kazuhiko, Kaeriyama Yoshiyuki, Suzuki Kenichi, Takizawa Hiroyuki, Kobayashi Hiroaki

IPSJ Online Transactions　1　(1)　1-11　2008
Publisher: Information Processing Society of Japan
DOI： 10.2197/ipsjtrans.1.1 　

ISSN： 1882-6660

Although ray tracing is the best approach to high-quality image synthesis, it takes a long time to generate images because of its huge amount of computation. In particular, ray-primitive intersection tests still dominate the execution time of ray tracing, and faster ray-primitive intersection algorithms are strongly required to interactively generate higher-quality images with more advanced effects. This paper presents a new fast algorithm for the intersection tests that makes good use of ray and object coherence in ray tracing. The proposed algorithm exploits the fact that the rays in a bundle share the same origin and have massive coherence. By using precomputation and early termination to reduce redundant calculations in the innermost intersection tests for the bundles, the proposed algorithm accelerates the intersection tests. Experimental results show that, by exploiting these features of ray bundles, the proposed algorithm performs intersection tests 1.43 times faster than Möller's algorithm.
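For readers unfamiliar with the baseline, the following is a minimal sketch of a single ray-triangle intersection test, assuming that "Möller's algorithm" in the abstract refers to the widely used Möller-Trumbore test. It is not the paper's proposed algorithm. The helper names are hypothetical, and vectors are plain 3-tuples.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle_intersect(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the ray parameter t of the hit point, or None on a miss."""
    e1 = sub(v1, v0)
    e2 = sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:           # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    tvec = sub(origin, v0)       # depends only on the origin and the triangle
    u = dot(tvec, p) * inv_det   # first barycentric coordinate
    if u < 0.0 or u > 1.0:       # early exit: outside the triangle
        return None
    q = cross(tvec, e1)          # also independent of the ray direction
    v = dot(direction, q) * inv_det
    if v < 0.0 or u + v > 1.0:   # early exit: outside the triangle
        return None
    t = dot(e2, q) * inv_det
    return t if t > eps else None
```

Note that `tvec` and `q` depend only on the ray origin and the triangle, so rays that share an origin could in principle reuse them. Whether the paper's precomputation works this way is not stated in the abstract.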
The potential of on-chip memory systems for future vector architectures Peer-reviewed

Hiroaki Kobayashi, Akihiko Musa, Yoshiei Sato, Hiroyuki Takizawa, Koki Okabe

HIGH PERFORMANCE COMPUTING ON VECTOR SYSTEMS 2007　247-+　2008
A Utility-based Double Auction Mechanism for Efficient Grid Resource Allocation Peer-reviewed

Chainan Satayapiwat, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

PROCEEDINGS OF THE 2008 INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS　252-260　2008

DOI： 10.1109/ISPA.2008.103 　
SPRAT: Runtime Processor Selection for Energy-aware Computing Peer-reviewed

Hiroyuki Takizawa, Katuto Sato, Hiroaki Kobayashi

2008 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING　386-393　2008

ISSN： 1552-5244
A Performance Study of Secure Data Mining on the Cell Processor Peer-reviewed

Hong Wang, Hiroyuki Takizawa, Hiroaki Kobayashi

CCGRID 2008: EIGHTH IEEE INTERNATIONAL SYMPOSIUM ON CLUSTER COMPUTING AND THE GRID, VOLS 1 AND 2, PROCEEDINGS　633-+　2008

DOI： 10.1109/CCGRID.2008.16 　
Implementation and Evaluation of a Distributed and Cooperative Load-Balancing Mechanism for Dependable Volunteer Computing Peer-reviewed

Yoshitomo Murata, Tsutomu Inaba, Hiroyuki Takizawa, Hiroaki Kobayashi

2008 IEEE INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS & NETWORKS WITH FTCS & DCC　316-+　2008

DOI： 10.1109/DSN.2008.4630100 　

ISSN： 1530-0889
Consideration of resource access history for optimizing overlay networks in P2P-based resource discovery Peer-reviewed

Tsutomu Inaba, Yoshitomo Murata, Hiroyuki Takizawa, Hiroaki Kobayashi

Proceedings - 2008 International Symposium on Applications and the Internet, SAINT 2008　269-272　2008

DOI： 10.1109/SAINT.2008.104 　
SPRAT: Runtime processor selection for energy-aware computing Peer-reviewed

Hiroyuki Takizawa, Katuto Sato, Hiroaki Kobayashi

Proceedings - IEEE International Conference on Cluster Computing, ICCC　2008　386-393　2008
Publisher: Institute of Electrical and Electronics Engineers Inc.
DOI： 10.1109/CLUSTR.2008.4663799 　

ISSN： 1552-5244
A shared cache for a chip multi vector processor Peer-reviewed

Akihiro Musa, Yoshiei Sato, Takashi Soga, Koki Okabe, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT　310　24-29　2008

DOI： 10.1145/1509084.1509088 　

ISSN： 1089-795X
Effects of MSHR and Prefetch Mechanisms on an On-Chip Cache of the Vector Architecture Peer-reviewed

Akihiro Musa, Yoshiei Sato, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

PROCEEDINGS OF THE 2008 INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS　335-+　2008

DOI： 10.1109/ISPA.2008.100 　
A Progressive 3-D Meshing Algorithm for Interactive Simulation of Soft Bodies Peer-reviewed

SAOI Tomoyuki, TAKIZAWA Hiroyuki, KOBAYASHI Hiroaki

Journal of INFORMATION　10　(6)　761-776　2007/12
A dependable Peer-to-Peer computing platform Peer-reviewed

Hong Wang, Hiroyuki Takizawa, Hiroaki Kobayashi

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE　23　(8)　939-955　2007/11

DOI： 10.1016/j.future.2007.03.004 　

ISSN： 0167-739X

eISSN： 1872-7115
Early evaluation of on-chip vector caching for the NEC SX vector architecture Peer-reviewed

Akihiro Musa, Yoshiei Sato, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

ACM/IEEE Supercomputing Conference (SC07)　2007/11
An Efficient Control Mechanism for Self-Organizing Overlay Networks of Large-Scale P2P Systems Peer-reviewed

Hiroaki Kobayashi, Hiroyuki Takizawa, Takuro Okawa, Tsutomu Inaba

Interdisciplinary Information Sciences　13　(2)　227-237　2007/09/18
Publisher: Tohoku University
DOI： 10.4036/iis.2007.227 　

ISSN： 1340-9050

P2P (Peer to Peer) has great potential to handle highly distributed computing resources and is expected to be a key technology for realizing ubiquitous computing environments over the Internet. However, P2P systems tend to waste network bandwidth on resource acquisition because of their decentralized resource management. This paper presents an efficient control mechanism for self-organizing overlay networks of large-scale P2P systems and evaluates its performance in detail. The overlay network is configured by making local clusters reflect the current interests of individual peers and connecting them together based on their similarity. As a result, the overlay network provides the resource exploitation space for some specific interests. In addition, the overlay network can be dynamically reconfigured as the interests of individual peers change over time, so that the peers most useful at that time are reconnected closer to their client peers. Therefore, multicasting of resource-request messages can be carried out only over peers with similar interests that are dynamically connected through the overlay network, resulting in a remarkable decrease in both the number of messages for resource acquisition and the number of hops a resource-request query travels to reach the peer that satisfies the request. Experimental results indicate that the proposed mechanism can realize effective self-organization of the overlay network in which useful peers are dynamically relocated around client peers. In addition, the adaptive allocation of links to peers according to their capability works well to maintain the high performance and fault tolerance of the self-organizing overlay network.
A Power-Aware Shared Cache Mechanism Based on Locality Assessment of Memory Reference for CMPs Peer-reviewed

Isao Kotera, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Proceedings of the 8th MEDEA workshop　121-128　2007/09/16
Analysis of hardware resource conflicts for runtime performance prediction of SMT processors Peer-reviewed

Masayuki Sato, Yusuke Funaya, Isao Kotera, Hiroyuki Takizawa, Hiroaki Kobayashi

Information Technology Letters　6　67-70　2007/09/05
A Power-Aware and Way-Allocatable Shared Cache Mechanism Peer-reviewed

Isao Kotera, Hiroyuki Takizawa, Hiroaki Kobayashi

Information Technology Letters　6　55-58　2007/09/05
Partial distortion entropy maximization for online data clustering Peer-reviewed

Hiroyuki Takizawa, Hiroaki Kobayashi

NEURAL NETWORKS　20　(7)　819-831　2007/09

DOI： 10.1016/j.neunet.2007.04.029 　

ISSN： 0893-6080
An Estimation-Based Redundant Task Dispatch Policy for Volunteer Computing Platforms Peer-reviewed

Hong Wang, Hiroyuki Takizawa, Hiroaki Kobayashi

Proceedings of the International Conference on Dependable Systems and Networks　348-349　2007/06/25

Fast Abstract (Supplemental Volume)
A fair-sharing and power-aware L2 cache system for chip multiprocessors Peer-reviewed

Isao Kotera, Hiroyuki Takizawa, Hiroaki Kobayashi

IEEE COOL Chips X　2007/04
A power-aware shared cache mechanism based on locality assessment of memory reference for CMPs Peer-reviewed

Isao Kotera, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT　113-120　2007

DOI： 10.1145/1327171.1327185 　

ISSN： 1089-795X
Preliminary evaluation for runtime auto-tuning of GPGPU applications Peer-reviewed

Hiroyuki Takizawa, Hiroki Shiratori, Hiroaki Kobayashi

The 2nd International Workshop on Automatic Performance Tuning　37-37　2007
Performance Evaluation of K-Means Clustering on the Cell Processor Peer-reviewed

Hong Wang, Hiroyuki Takizawa, Hiroaki Kobayashi

Proceedings of High Performance Computing Symposium　2007　(1)　161-168　2007/01
A memory-efficient scheme for fast spectral photon mapping Peer-reviewed

Kosuke Ikeda, Hiroyuki Takizawa, Hiroaki Kobayashi

PROCEEDINGS OF THE NINTH IASTED INTERNATIONAL CONFERENCE ON COMPUTER GRAPHICS AND IMAGING　75-80　2007
An on-chip cache design for vector processors Peer-reviewed

Akihiro Musa, Yoshiei Sato, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT　17-23　2007

DOI： 10.1145/1327171.1327173 　

ISSN： 1089-795X
An estimation-based redundant task dispatch policy for volunteer computing platforms Peer-reviewed

Hong Wang, Hiroyuki Takizawa, Hiroaki Kobayashi

The International Conference on Dependable Systems and Networks　348-349　2007
A Dynamic Logical Link Management Mechanism for P2P Resource Discovery System Peer-reviewed

Takuro Okawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Information Technology Letters　5　363-366　2006/09
Publisher: Forum on Information Technology
Thread Scheduling Based on the Thread Characteristics for Multi-Core Processors Peer-reviewed

Yusuke Funaya, Isao Kotera, Hiroyuki Takizawa, Hiroaki Kobayashi

Information Technology Letters　5　37-40　2006/09
Towards Effective GPU Implementation of Neural Networks Peer-reviewed

Hiroyuki Takizawa, Tatsuya Chida, Hiroaki Kobayashi

The 4th International Conference on Information-MFCSIT’06　408-411　2006/08
Hierarchical parallel processing of large scale data clustering on a PC cluster with GPU co-processing Peer-reviewed

H Takizawa, H Kobayashi

JOURNAL OF SUPERCOMPUTING　36　(3)　219-234　2006/06

DOI： 10.1007/s11227-006-8294-1 　

ISSN： 0920-8542
Design and Implementation of an Efficient Search Mechanism based on the Hybrid P2P Model for Ubiquitous Computing Systems Peer-reviewed

T Inaba, T Okawa, Y Murata, H Takizawa, H Kobayashi

INTERNATIONAL SYMPOSIUM ON APPLICATIONS AND THE INTERNET, PROCEEDINGS　45-+　2006

DOI： 10.1109/SAINT.2006.23 　
A distributed and cooperative load balancing mechanism for large-scale P2P systems Peer-reviewed

Y Murata, T Inaba, H Takizawa, H Kobayashi

INTERNATIONAL SYMPOSIUM ON APPLICATIONS AND THE INTERNET WORKSHOPS, PROCEEDINGS　126-129　2006

DOI： 10.1109/SAINT-W.2006.2 　
Radiative heat transfer simulation using programmable graphics hardware Peer-reviewed

Hiroyuki Takizawa, Noboru Yamada, Seigo Sakai, Hiroaki Kobayashi

Proceedings - 5th IEEE/ACIS Int. Conf. on Comput. and Info. Sci., ICIS 2006. In conjunction with 1st IEEE/ACIS, Int. Workshop Component-Based Software Eng., Softw. Archi. and Reuse, COMSAR 2006　2006　29-37　2006

DOI： 10.1109/ICIS-COMSAR.2006.70 　
Implications of memory performance for highly efficient supercomputing of scientific applications Peer-reviewed

Akihiro Musa, Hiroyuki Takizawa, Koki Okabe, Takashi Soga, Hiroaki Kobayashi

PARALLEL AND DISTRIBUTED PROCESSING AND APPLICATIONS　4330　845-+　2006

ISSN： 0302-9743
Evaluation and Modeling of Resource Discovery in Large Scale P2P Systems Peer-reviewed

Takurou Okawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Forum of Information Technology(FIT2005) Information Technology Letters　4　21-24　2005/09
Publisher: Forum on Information Technology
Performance Evaluation of the SX-7 System Using the HPC Challenge Benchmark Peer-reviewed

Hiroyuki Takizawa, Tatsunobu Kokubo, Kenryo Kataumi, Hiroaki Kobayashi

IPSJ journal　46　(SIG 12(ACS 11))　37-45　2005/08

Also presented at SACSIS2005 (May 2005)
An Incremental Photon-Mapping Algorithm for Fast Walk-Through Animations Peer-reviewed

Kosuke Ikeda, Hiroyuki Takizawa, Hiroaki Kobayashi

Computer Graphics and Imaging (CGIM 2005)　2005/08
Locality Analysis to Control Dynamically Way-Adaptable Caches Peer-reviewed

KOBAYASHI Hiroaki, KOTERA Isao, TAKIZAWA Hiroyuki

ACM SIGARCH Computer Architecture News　33　(3)　25-32　2005/06

DOI： 10.1145/1101868.1101874 　
Evaluation of Large-Scale Remote Interactive Visualization via Super SINET Peer-reviewed

TAKIZAWA Hiroyuki, KOBAYASHI Hiroaki

Journal of INFORMATION　8　(3)　383-390　2005/05
Performance Evaluation of the SX-7 System Using the HPC Challenge Benchmark Peer-reviewed

Hiroyuki Takizawa, Tatsunobu Kokubo, Kenryo Kataumi, Hiroaki Kobayashi

Symposium on Advanced Computing Systems and Infrastructures(SACSIS2005)　2005　(5)　25-33　2005/05
A distributed cooperative scheduling mechanism for P2P computing

Yoshitomo Murata, Tsutomu Inaba, Hiroyuki Takizawa, Hiroaki Kobayashi

Advanced Network & Computing Technology Workshop　(33)　23-30　2005/01/24
A self-organizing overlay network to exploit the locality of interests for effective resource discovery in P2P systems Peer-reviewed

H Kobayashi, H Takizawa, T Inaba, Y Takizawa

2005 SYMPOSIUM ON APPLICATIONS AND THE INTERNET, PROCEEDINGS　246-255　2005
A P2P Semantic Information Search Mechanism for Ubiquitous Grid Computing Systems

Tsutomu Inaba, Takuro Okawa, Yoshitomo Murata, Hiroyuki Takizawa, Hiroaki Kobayashi

Advanced Network & Computing Technology Workshop　(33)　45-52　2005/01
A workflow management mechanism for peer-to-peer computing platforms Peer-reviewed

H Wang, H Takizawa, H Kobayashi

PARALLEL AND DISTRIBUTED PROCESSING AND APPLICATIONS　3758　827-832　2005

ISSN： 0302-9743
Efficient parallel processing of competitive learning algorithms Peer-reviewed

K Sano, S Momose, H Takizawa, H Kobayashi, T Nakamura

PARALLEL COMPUTING　30　(12)　1361-1383　2004/12

DOI： 10.1016/j.parco.2004.10.001 　

ISSN： 0167-8191

eISSN： 1872-7336
Evaluation of Large-Scale Remote Interactive Visualization via Super SINET Peer-reviewed

TAKIZAWA Hiroyuki, KOBAYASHI Hiroaki

The 3rd International Conference on Information (INFO'2004)　3　2004/11
Evaluation Experiments on Large-Scale Remote Interactive Visualization via Super SINET

滝沢寛之, 小林広明

全国共同利用情報基盤センター研究開発論文集　26　24-29　2004/11
An Effective Control Mechanism for Way-Adaptable Caches

KOTERA Isao, TAKIZAWA Hiroyuki, KOBAYASHI Hiroaki

電気関係学会東北支部連合大会　2004/08
Evaluation of Large-Scale Remote Visualization Processing Using Super SINET

滝沢寛之, 小林広明

東北大学情報シナジーセンター年報　3　90-96　2004/06
Quantitative Evaluation of Resource Discovery and Communication Overheads in the Globus Grid Middleware

村田善智, 稲葉勉, 滝沢寛之, 小林広明

東北大学情報シナジーセンター年報　3　115-123　2004/06
An Effective Implementation of Vector Quantization Encoder on Commodity Graphics Hardware Peer-reviewed

Hiroyuki TAKIZAWA, Hiroaki KOBAYASHI

Proceedings of the 2nd International Conference on Information Technology and Applications(ICITA2004)　2004/01
A fast computation scheme of partial distortion entropy updating Peer-reviewed

H Takizawa, F Kobayashi

ITCC 2004: INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: CODING AND COMPUTING, VOL 1, PROCEEDINGS　1　736-741　2004

DOI： 10.1109/ITCC.2004.1286555 　
Multi-grain parallel processing of data-clustering on programmable graphics hardware Peer-reviewed

H Takizawa, H Kobayashi

PARALLEL AND DISTRIBUTED PROCESSING AND APPLICATIONS, PROCEEDINGS　3358　(3358)　16-27　2004

ISSN： 0302-9743
A Study on Self-Organizing P2P Networks for Dynamic Resource Management in Grids

瀧澤泰明, 滝沢寛之, 佐野健太郎, 小林広明, 中村維男

情報処理学会東北支部研究会　2003/11
Vector Quantization Codebook Design Restraining Edge Degradation of Images Peer-reviewed

TAKIZAWA Hiroyuki, MIURA Takeshi, KOBAYASHI Hiroaki, NAKAMURA Tadao

FIT2003 Information Technology Letters　2　(2)　243-244　2003/09
Vector quantization codebook design using the law-of-the-jungle algorithm Peer-reviewed

H Takizawa, T Nakajima, K Sano, H Kobayashi, T Nakamura

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS　E86D　(6)　1068-1077　2003/06

ISSN： 0916-8532
A Comparison Study Of Vector Quantization Codebook Design Algorithms Based On The Equidistortion Principle Peer-reviewed

Hiroyuki Takizawa, Taira Nakajima, Kentaro Sano, Hiroaki Kobayashi, Tadao Nakamura

Proceedings of the 21st IASTED International Multi-Conference on Applied Informatics(AI2003)　255-261　2003/03
A Decision Criterion to Relocate Codewords for Adaptive Vector Quantization Peer-reviewed

H. Takizawa

Proceedings of the 21st IASTED International Multi-Conference on Applied Informatics(AI2003)　262-268　2003/02
Parallel Algorithm for the Law-of-the-Jungle Learning to the Fast Design of Optimal Codebooks Peer-reviewed

Kentaro Sano, Shintaro Momose, Hiroyuki Takizawa, Taira Nakajima, Clecio Donizete Lima, Hiroaki Kobayashi, Tadao Nakamura

Proceedings of the 14th IASTED International Conference on Parallel and Distributed Computing and Systems(PDCS2002)　723-728　2002/11
Practical Volume Compression based on Vector Quantization using the Law-of-the-Jungle Algorithm Peer-reviewed

Kentaro Sano, Hiroyuki Takizawa, Taira Nakajima, Hiroaki Kobayashi, Tadao Nakamura

Proceedings of the 2nd IASTED International Conference on Visualization, Imaging and Image Processing(VIIP2002)　519-526　2002/09
A Vector Quantizer preventing Image Degradation Peer-reviewed

Takeshi Miura, Hiroyuki Takizawa, Kentaro Sano, Taira Nakajima, Hiroaki Kobayashi, Tadao Nakamura

FIT Information Technology Letters　185-186　2002/09
Parallel processing for vector quantization codebook design

S. Momose, K. Sano, H. Takizawa, T. Nakajima, H. Kobayashi, T. Nakamura

並列/協調/分散処理に関する「湯布院」サマーワークショップ資料　2002/08
Updated Computer Systems of Integrated Information Processing Center, Niigata University

Hiroyuki Takizawa

Yearly report of Integrated Information Processing Center, Niigata University　(13)　21-27　2002/03
Countermeasures against Unauthorized Access When Deploying PC-UNIX

滝沢寛之

新潟大学総合情報処理センター年報NIICE　12　(12)　13-19　2001/03
An active learning algorithm based on existing training data Peer-reviewed

H Takizawa, T Nakajima, H Kobayashi, T Nakamura

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS　E83D　(1)　90-99　2000/01

ISSN： 0916-8532
A topology preserving neural network for nonstationary distributions Peer-reviewed

T Nakajima, H Takizawa, H Kobayashi, T Nakamura

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS　E82D　(7)　1131-1135　1999/07

ISSN： 0916-8532
A self-organizing network system forming memory from nonstationary probability distributions Peer-reviewed

T. Nakajima, H. Takizawa, H. Kobayashi, T. Nakamura

Proceedings of IJCNN99　1999/07
Acceleration techniques for the network inversion algorithm Peer-reviewed

H Takizawa, T Nakajima, M Nishi, H Kobayashi, T Nakamura

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS　E82D　(2)　508-511　1999/02

ISSN： 0916-8532
Application of the neural network (BPD with cross-talk links) to FSK demodulation Peer-reviewed

M. Nishi, J. Furuya, H. Takizawa, T. Nakamura

The trans. of the Japanese society of technical education　41　(1)　9-16　1999/01
Kohonen learning with a mechanism, the law of the jungle, capable of dealing with nonstationary probability distribution functions Peer-reviewed

T Nakajima, H Takizawa, H Kobayashi, T Nakamura

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS　E81D　(6)　584-591　1998/06

ISSN： 0916-8532
Facial image processing using wavelet transform

K. Iimura, H. Takizawa, T. Nakajima, H. Kobayashi, T. Nakamura

Tohoku-Section Joint Convention of Institutes of Electrical and Information Engineers　1998
A method for improving classification capability of multilayer perceptrons Peer-reviewed

H. Takizawa, T. Nakajima, H. Kobayashi, T. Nakamura

The trans. of the IEICE　J80-D-II　(1)　390-393　1997/01
Facial expression recognition using neural networks capable of recognizing at an infant level Peer-reviewed

T. Nakajima, H. Takizawa, M. Simamura, H. Kobayashi, T. Nakamura

Proceedings of WAIMH 6th Congress　66-0　1996/07
A study of optimal learning methods in neural networks

H. Takizawa, T. Nakajima, H. Kobayashi, T. Nakamura

IPSJ Regional Symposium in Tohoku　1996
An automatic facial expression recognition system using neural networks

T. Nakajima, H. Takizawa, M. Shimamura, H. Kobayashi, T. Nakamura

IEICE Society Conference　1995
Facial image recognition using neural networks

H. Takizawa, T. Nakajima, H. Kobayashi, T. Nakamura

Tohoku-Section Joint Convention of Institutes of Electrical and Information Engineers　1995

Misc. 60

Human Core Temperature Rise for Whole-Body Exposure at 1-100 GHz Under Different Ambient Conditions

億田龍太朗, 小寺紗千子, 滝沢寛之, 平田晃正

電子情報通信学会技術研究報告(Web)　124　(357(EST2024 94-122))　2025

ISSN： 2432-6380
Estimation of Number of Heat-Related Illness Patients in Eight Prefectures

高田旭登, 江川隆輔, 滝沢寛之, 平田晃正

電子情報通信学会大会講演論文集(CD-ROM)　2022　2022

ISSN： 1349-144X
Estimation of Elderly Heat Stroke Patients in Japan Considering Heat Adaptation

西村卓, 小寺紗千子, 滝沢寛之, 江川隆輔, 平田晃正

電子情報通信学会大会講演論文集(CD-ROM)　2020　2020

ISSN： 1349-144X
A Study on Task Offloading from Vector Processors to FPGAs

土方康平, 上野知洋, 江川隆輔, 滝沢寛之, 佐野健太郎

電子情報通信学会技術研究報告　119　(371(VLD2019 54-93))　2020

ISSN： 0913-5685
Inter-Memory Communication Performance of a Tightly Coupled FPGA Cluster Using RDMA

上野知洋, 佐野健太郎, 土方康平, 滝沢寛之

電子情報通信学会技術研究報告　119　(18(RECONF2019 1-19)(Web))　2019

ISSN： 0913-5685
Performance Evaluation of SX-ACE Using HPCMG-FV

江川隆輔, 磯部洋子, 加藤季広, 小松一彦, 滝沢寛之, 小林広明, 撫佐昭裕

東北大学情報シナジーセンター大規模科学計算機システム広報SENAC　50　(3)　15-18　2017/07
Publisher: 東北大学サイバーサイエンスセンター
ISSN： 0286-7419
Evaluation of Xevolver-Based Management of Performance-Optimized Code for MSSG, a Coupled Atmosphere-Ocean Multi-Scale Model

板倉憲一, 小松一彦, 江川隆輔, 滝沢寛之

ハイパフォーマンスコンピューティングと計算科学シンポジウム論文集　(2017)　12-12　2017/05/29
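Usage Report on Free Supercomputer Access for Developing Human Resources in Computational Science and Computer Science: Usage Report for the Graduate School of Information Sciences Course on Ultra-High-Speed Information Processing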

滝沢寛之, 江川隆輔, 後藤英昭

東北大学情報シナジーセンター大規模科学計算機システム広報SENAC　50　(3)　23-27　2017
Performance Evaluation of the HPCG Benchmark on SX-ACE

小松一彦, 江川隆輔, 磯部洋子, 緒方隆盛, 滝沢寛之, 小林広明

SENAC : 東北大学大型計算機センター広報　48　(3)　14-19　2015/07
Publisher: 東北大学サイバーサイエンスセンター
ISSN： 0286-7419
Tohoku University Cyberscience Center Report on Research Activities for Promoting Code Acceleration (No. 6)

小林広明, 岡部公起, 滝沢寛之, 江川隆輔, 小松一彦, 大泉健治, 小野敏, 山下毅, 佐々木大輔, 森谷友映, 齋藤敦子, 撫佐昭裕, 松岡浩司, 渡部修他

2015/04
Auto-Tuning with Xevolver

20　(2)　3258-3261　2015
Publisher: 日本計算工学会
ISSN： 1341-7622
Auto-Tuning with Xevolver

平澤将一, 肖熊, 滝沢寛之, 小林広明

計算工学会学会誌「計算工学」　20　(2)　14-17　2015
Heuristic Data Partitioning for Social Networking Service

2013　(34)　1-8　2013/12/09
A Study on the Effects of Optimization Techniques across Multiple Platforms

小松一彦, 佐々木俊英, 江川隆輔, 滝沢寛之, 小林広明

研究報告ハイパフォーマンスコンピューティング（HPC）　2013　(24)　1-7　2013/07/24
Publisher: 一般社団法人情報処理学会

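HPC systems have become increasingly diverse in recent years, creating strong demand for performance-portable HPC codes that can achieve high performance on multiple kinds of HPC systems with different characteristics. This study aims to analyze in detail how optimization techniques targeting various HPC systems affect the performance of HPC codes, and to develop highly performance-portable HPC codes based on the findings. This report analyzes the performance portability of HPC codes when different manual optimizations are combined with one another or with automatic optimizations. For each HPC system, we evaluate the synergistic effects of these combinations and discuss optimizations that may degrade performance portability.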
An Efficient Method for Improving Performance Portability by Limiting Tuning Targets

平澤将一, 秋葉諒, 滝沢寛之, 小林広明

研究報告ハイパフォーマンスコンピューティング（HPC）　2013　(19)　1-8　2013/05/22
Publisher: 一般社団法人情報処理学会

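As computing systems diversify, existing scientific computing programs often need to be ported to new systems and have their performance optimized. However, porting and optimizing large-scale scientific computing programs requires enormous effort, which has become a problem. This study proposes a method that efficiently improves the performance portability of an entire application by narrowing down the parts of the source code that should be optimized first when the goal is performance portability. Evaluation results with benchmark programs and real applications show that the proposed method can limit the parts of the source code to be optimized so as to efficiently improve the performance portability of the whole application.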
Message from the chairs of iWAPT 2012

Hiroyuki Takizawa, Richard Vuduc, Takeshi Iwashita

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)　7851　2013

DOI： 10.1007/978-3-642-38718-0 　

ISSN： 0302-9743 1611-3349
Checkpoint/Restart for Hybrid Systems

滝沢寛之, 佐藤雅之, 江川隆輔, 小林広明

日本信頼性学誌　35　(7)　2013

DOI： 10.11348/reajshinrai.35.8_515 　
A Portable Build System That Works with Integrated Development Environments

平澤将一, 滝沢寛之, 小林広明

研究報告ハイパフォーマンスコンピューティング（HPC）　2012　(28)　1-8　2012/09/26

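This study develops a portable build system as a step toward a framework for developing applications while preserving performance portability. The configurations of today's high-performance computing (HPC) systems have become complex, and it is difficult to predict an application's effective performance without running it. We therefore envision supporting the maintenance of performance portability by periodically running the application under development, implicitly collecting its performance profile, identifying the parts with low performance portability, and presenting them interactively to the programmer. Realizing such a development support tool requires the ability to implicitly build and run the application under development on a variety of systems. This study develops a build system with such portability and discusses the functions needed in an application development support environment.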
Implementation and Evaluation of the Nanopowder Growth Simulation with OpenACC

2012　(10)　1-7　2012/09/26
Performance Evaluation of BCM on Large-Scale Computing Systems

小松一彦, 曽我隆, 江川隆輔, 滝沢寛之, 小林広明

SENAC : 東北大学大型計算機センター広報　45　(3)　17-25　2012/07
Publisher: 東北大学サイバーサイエンスセンター
ISSN： 0286-7419
Evaluation of GPU Computing Based on An Automatic Program Generation Technology

2011　(18)　1-7　2011/07/20
A Client-Level Deadline Scheduling Strategy for Volunteer Computing Systems

2011　45-54　2011/05/18
A Performance Tuning Strategy Based on the Roofline Model for Vector Processors

4　(3)　77-87　2011/05/12

ISSN： 1882-7829
Tohoku University Cyberscience Center Report on Research Activities for Promoting Code Acceleration (No. 5)

小林広明, 岡部公起, 滝沢寛之, 江川隆輔, 伊藤英一, 大泉健治, 小野敏, 小久保達信, 橋本ユキ子, 磯部洋子, 撫佐昭裕, 神山典, 金野浩伸

2011/04
Program Optimization Techniques for Chip Multi-Vector Processors

佐藤義永, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

東北大学情報シナジーセンター大規模科学計算機システム広報SENAC　44　(2)　29-36　2011/04
A Self-Organized Overlay Network Management Mechanism for Heterogeneous Environments

Tsutomu Inaba, Hiroyuki Takizawa, Hiroaki Kobayashi

52　(2)　320-333　2011/02/15
Publisher: 情報処理学会
ISSN： 1882-7764
Energy Consumption of a Chip Multi-Vector Processor Using Real Applications

2010　(3)　1-8　2010/12/09
Publisher: 情報処理学会
ISSN： 1884-0930
An Out-of-order Vector Processing Mechanism for Multimedia Applications

GAO YE, EGAWA RYUSUKE, TAKIZAWA HIROYUKI, KOBAYASHI HIROAKI

2010　(24)　1-10　2010/07/27
Publisher: 情報処理学会
ISSN： 0919-6072
Performance Evaluation of GPU Computing with OpenCL

ARAI YUSUKE, SATO KATSUTO, TAKIZAWA HIROYUKI, KOBAYASHI HIROAKI

2010　(11)　1-7　2010/02/15
Publisher: 情報処理学会
ISSN： 0919-6072
Implementation and Evaluation of a Checkpoint/Restart Tool for CUDA Applications

TAKIZAWA HIROYUKI, SATO KATSUTO, KOMATSU KAZUHIKO, KOBAYASHI HIROAKI

122　(7)　G1-G7　2009/10/09
Publisher: 情報処理学会
ISSN： 0919-6072
RC-008 Client-Level Task Scheduling for Effective Volunteer Computing

Murata Yoshitomo, Endo Toshiaki, Takizawa Hiroyuki, Kobayashi Hiroaki

8　(1)　165-172　2009/08/20
Publisher: Forum on Information Technology
C-024 An Auction based Resource Allocation Considering Multifaceted Utilities in a Peer to Peer Environment

Satayapiwat Chainan, Komatsu Kazuhiko, Egawa Ryusuke, Takizawa Hiroyuki, Kobayashi Hiroaki

8　(1)　491-494　2009/08/20
Publisher: Forum on Information Technology

Recently, many market-based approaches have been studied as promising alternatives for the resource allocation problem. In particular, auction-based approaches are widely chosen because of their distributed nature and relatively low complexity. However, employing an auction to allocate jobs is only suitable for homogeneous resource environments. This paper proposes an auction-based resource allocation mechanism that enables resource allocation in a heterogeneous environment while minimizing user input. Our preliminary results show that the proposed mechanism improves the performance of important jobs under high load.
C-023 Performance Evaluation towards BLAS with Automatic Processor Selection

Komatsu Kazuhiko, Koyama Kentaro, Sato Katsuto, Takizawa Hiroyuki, Kobayashi Hiroaki

8　(1)　485-490　2009/08/20
Publisher: Forum on Information Technology
Performance Optimization Techniques for Vector Processors with Cache Memory

SATO YOSHIEI, NAGAOKA RYUICHI, MUSA AKIHIRO, EGAWA RYUSUKE, TAKIZAWA HIROYUKI, OKABE KOKI, KOBAYASHI HIROAKI

2009　(6)　1-10　2009/07/28
Publisher: 情報処理学会
ISSN： 0919-6072
Large-Scale Parallel Simulation on SX-9 (3.2 The 7th Information Synergy Workshop, 3. Research Activity Reports)

曽我隆, 下村陽一, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明, 高橋俊, 中橋和博

年報　8　88-93　2009/07
Publisher: 東北大学サイバーサイエンスセンター
Software Automatic Tuning Technologies for Scientific and Technical Computing : Software Automatic Tuning in GPU Computing

TAKIZAWA Hiroyuki

IPSJ Magazine　50　(6)　527-531　2009/06/15
Publisher: Information Processing Society of Japan (IPSJ)
ISSN： 0447-8053
Software Automatic Performance Tuning in GPU Computing

Hiroyuki TAKIZAWA

Journal of Information Processing Society of Japan　50　(6)　527-531　2009/06/15
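Report on the Creative Engineering Training Program: Experiencing the Appeal of Computational Science and Computer Science with a Supercomputer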

滝沢寛之, 江川隆輔, 笹尾泰洋, 佐野健太郎, 山本悟, 小林広明

東北大学サイバーサイエンスセンター大規模科学計算システム広報SENAC　42　(2)　87-90　2009/02
624 A study of energy-aware GPU computing

Takizawa Hiroyuki, Sato Katuto, Kobayashi Hiroaki

The Computational Mechanics Conference　2008　(21)　558-559　2008/11/01
Publisher: The Japan Society of Mechanical Engineers
ISSN： 1348-026X
RC-006 Hardware Design of A Way-Allocatable Shared Cache Mechanism

Abe Kenta, Kotera Isao, Egawa Ryusuke, Takizawa Hiroyuki, Kobayashi Hiroaki

7　(1)　35-38　2008/08/20
Publisher: Forum on Information Technology
A programming language extension and its automatic optimization techniques for exploiting the potential of GPUs

SATO KATUTO, TAKIZAWA HIROYUKI, KOBAYASHI HIROAKI

IPSJ SIG Notes　2008　(74)　199-204　2008/07/29
Publisher: Information Processing Society of Japan (IPSJ)
ISSN： 0919-6072

GPUs have great potential for high-performance computing and have been used in various applications in addition to graphics processing. To achieve high performance with GPUs, we have to carry out architecture-aware optimizations because of their unique architecture. We have proposed SPRAT, a programming language for hybrid systems of CPUs and GPUs, to realize both program portability and high computational efficiency. This paper proposes some automatic optimization techniques based on memory access adjustments. The results show significant performance improvements in the execution of edge detection and LU decomposition.
On-Chip Cache Memory Systems for Next Vector Architectures

7　89-93　2008/07
Publisher: 東北大学サイバーサイエンスセンター
A Stream Programming Language for GPU Computing

TAKIZAWA Hiroyuki, SATO Katuto, KOBAYASHI Hiroaki

Journal of the Visualization Society of Japan　28　(1)　271-274　2008/07/01
Publisher: 可視化情報学会
ISSN： 0916-4731
Performance Evaluation of Cache Memory for Vector Processors

佐藤義永, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

情報処理学会シンポジウム論文集　2008　(2)　55　2008/01/17

ISSN： 1344-0640
SPRAT: A Stream Processing Language with Runtime Auto-Tuning

滝沢寛之

先進的計算基盤システムシンポジウム(SACSIS2008)　139-148　2008
I-004 A Parallel Image Generation Algorithm based on Partitioning of Photon Maps

Tamura Masahide, Takizawa Hiroyuki, Kobayashi Hiroaki

6　(3)　203-206　2007/08/22
Publisher: Forum on Information Technology
A Study on Dynamic Task Assignment to CPU and GPU Based on Runtime Performance Prediction

SHIRATORI Hiroki, TAKIZAWA Hiroyuki, KOBAYASHI Hiroaki

IEICE technical report　107　(175)　37-42　2007/08/02
Publisher: The Institute of Electronics, Information and Communication Engineers
ISSN： 0913-5685

Recent studies of general-purpose computation on graphics processing units (GPUs) have shown that a PC equipped with a high-performance CPU and GPU can be regarded as a heterogeneous parallel processing system. On the other hand, programming for such a system has become complicated. In order to exploit the potential of the system, unified programming models for the CPU and GPU have been studied. However, in most existing development tools for GPGPU applications, the choice of whether the CPU or the GPU executes a program must be made manually and statically. Because the appropriate selection depends on information determined at runtime, processing efficiency improves if the appropriate processor can be dynamically selected based on performance prediction at runtime. This paper examines the effectiveness of dynamically selecting the appropriate processor based on the estimated execution time and the processor switching cost. The experimental results show that the cost of processor switching, excluding data transfer, is negligible, and hence processor switching can improve performance if the execution time is long compared to the prediction error.
The Evaluation of A Way-Allocatable Shared Cache Mechanism

KOTERA ISAO, EGAWA RYUSUKE, TAKIZAWA HIROYUKI, KOBAYASHI HIROAKI

IPSJ SIG Notes　2007　(79)　31-36　2007/08/01
Publisher: Information Processing Society of Japan (IPSJ)
ISSN： 0919-6072

We have proposed a way-allocatable shared cache mechanism for chip multiprocessors, which can reduce power consumption while maintaining performance by employing cache partitioning and power gating. In the proposed mechanism, a metric of cache access locality is defined and used for the cache partitioning and the power gating. Based on the metric, the proposed mechanism can flexibly change the configuration to be either performance-oriented or power-oriented. This paper evaluates the validity of the proposed mechanism using benchmarks with different cache access behaviors. The evaluation results show that the proposed mechanism can appropriately partition the shared cache for applications with high locality. In addition, our proposal in the performance-oriented mode can reduce energy consumption by 28% while improving performance by 0.3%.
SC|06 Survey Report (3.2 The 5th Information Synergy Workshop, 3. Research Activity Reports)

小野敏, 滝沢寛之, 小林広明

年報　6　83-87　2007/07
Publisher: 東北大学情報シナジーセンター
SC|05 Survey Report (3.2 The 4th Information Synergy Workshop, 3. Research Activities)

大泉健治, 伊藤英一, 滝沢寛之, 小林広明

年報　5　71-74　2006/06
Publisher: 東北大学情報シナジーセンター
A Runtime Optimization Method for Redundant Task Dispatch on P2P Computing Platforms (3.2 The 4th Information Synergy Workshop, 3. Research Activities)

Wang Hong, Takizawa Hiroyuki, Kobayashi Hiroaki

年報　5　100-105　2006/06
Publisher: 東北大学情報シナジーセンター
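Performance Evaluation of Large-Scale Scientific Computing Systems Using Real Simulation Codes (3.2 The 4th Information Synergy Workshop, 3. Research Activities)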

滝沢寛之, 岡部公起, 伊藤英一, 撫佐昭裕, 曽我隆, 伊藤学, 小林広明

年報　5　78-83　2006/06
Publisher: 東北大学情報シナジーセンター
Performance Evaluation of SX Systems with the HPC Challenge Benchmark (3.2 The 3rd Information Synergy Workshop, 3. Research Activities)

小林広明, 滝沢寛之, 小久保達信, 岡部公起, 伊藤英一, 小林義昭, 浅見暁, 小林一夫, 後藤記一, 片海健亮, 深田大輔

年報　4　98-116　2005/05
Publisher: 東北大学情報シナジーセンター
Performance Evaluation of SX Systems with the HPC Challenge Benchmark

小林広明, 滝沢寛之, 小久保達信, 岡部公起, 伊藤英一, 小林義昭, 浅見暁, 小林一夫, 後藤記一, 片海健亮, 深田大輔

東北大学情報シナジーセンター大規模科学計算機システム広報SENAC　38　(1)　5-28　2005/01
Evaluation of Large-Scale Remote Visualization Processing Using Super SINET

滝沢寛之, 小林広明

東北大学情報シナジーセンター大規模科学計算機システム広報SENAC　37　(2)　5-10　2004/04
Performance Analysis of a Parallel Law-of-the-Jungle Algorithm for Generating Codebooks of Vector Quantization

MOMOSE Shintaro, SANO Kentaro, TAKIZAWA Hiroyuki, NAKAJIMA Taira, KOBAYASHI Hiroaki, NAKAMURA Tadao

IEICE technical report. Neurocomputing　103　(92)　25-30　2003/05/22
Publisher: The Institute of Electronics, Information and Communication Engineers
ISSN： 0913-5685

Vector quantization is an attractive technique for lossy data compression, which has been a key technology for efficient data storage and/or transfer. So far, various algorithms have been proposed to design optimal codebooks presenting quantization with minimized errors. In particular, the Law-of-the-Jungle (LOJ) learning algorithm has been proposed to achieve rapid codebook design by algorithmic improvements. However, its acceleration is still required when large data sets are processed on a single computer. In order to achieve faster codebook design, we have proposed a scalable parallel codebook design algorithm for parallel computers. This paper analyzes and evaluates the performance of the parallel LOJ learning algorithm on three types of parallel computers: an IBM SP2, an NEC AzusA and a PC cluster.
Parallel Codebook Generation for Optimal Vector Quantizer

MOMOSE Shintaro, SANO Kentaro, TAKIZAWA Hiroyuki, NAKAJIMA Taira, LIMA Clecio Donizete, KOBAYASHI Hiroaki, NAKAMURA Tadao

IPSJ SIG Notes　2002　(80)　67-72　2002/08/21
Publisher: Information Processing Society of Japan (IPSJ)
ISSN: 0919-6072

Vector quantization is an attractive technique for lossy data compression, which has been a key technology for data storage and/or transfer. So far, various algorithms have been proposed to design optimal codebooks that yield quantization with minimized errors. In particular, the Law-of-the-Jungle (LOJ) learning algorithm has been proposed to achieve rapid codebook design by algorithmic improvements. However, its acceleration is still required when large data sets are processed on a single computer. Therefore, a scalable parallel codebook design algorithm for parallel computers is required. This paper presents a parallel algorithm for LOJ learning, suitable for distributed-memory parallel computers with a message-passing mechanism. Experimental results indicate high scalability of the proposed parallel algorithm on the IBM SP2 parallel computer with 32 processing elements.
ベクトル量子化のための並列コードブック生成アルゴリズムの性能評価(2.<特集>第1回情報シナジー研究会)

百瀬真太郎, 佐野健太郎, 滝沢寛之, 中島平, 小林広明, 中村維男, Clecio Donizete Lima, 東北大学大学院情報科学研究科, 東北大学大学院情報科学研究科, 東北大学情報シナジーセンター, 東北大学大学院工学研究科, 東北大学大学院情報科学研究科, 東北大学情報シナジーセンター, 東北大学大学院情報科学研究科

年報　2　33-42　2002/07/01

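Vector quantization is a highly efficient data compression technique and a core technology for data storage and transfer. Various methods have been proposed to generate optimal codebooks for quantization with small errors; among them, the Law-of-the-Jungle (LOJ) algorithm, which shortens codebook generation time through algorithmic improvements, has attracted attention. However, when a large data set is processed on a single CPU, there is a limit to the time reduction achievable through algorithmic improvements, and further speedup through parallel processing is required. This paper proposes a parallel LOJ algorithm suited to distributed-memory parallel computers. A performance evaluation of the parallel LOJ algorithm on an IBM SP2, an NEC AzusA, and a PC cluster showed high speedup relative to the number of processors on all three systems.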
新潟大学総合情報処理センターコンピュータシステムの更新

滝沢寛之

新潟大学総合情報処理センター年報NIICE　(13)　21-27　2002/03
PC-UNIX 導入時の不正アクセス対策

滝沢寛之

新潟大学総合情報処理センター年報NIICE　(12)　13-19　2001/03

Books and Other Publications 15

Sustained Simulation Performance 2022

Michael M. Resch, Johannes Gebert, Hiroaki Kobayashi, Hiroyuki Takizawa, Wolfgang Bez

Springer Cham　2024/03

ISBN: 9783031410727
VLSI Design and Test for Systems Dependability

Hiroyuki Takizawa, Ye Gao, Masayuki Sato, Ryusuke Egawa, Hiroaki Kobayashi

Springer Japan　2019/01
Advanced Software Technologies for Post-Peta Scale Computing

Hiroyuki Takizawa, Reiji Suda, Daisuke Takahashi, Ryusuke Egawa

Springer　2018/12
Sustained Simulation Performance 2016

Hiroyuki Takizawa, Takeshi Yamada, Shoichi Hirasawa, Reiji Suda

Springer-Verlag　2016
コンピュータ工学入門

鏡慎吾, 佐野健太郎, 滝沢寛之, 岡谷貴之, 小林広明

コロナ社　2015/04
Sustained Simulation Performance 2015

Hiroyuki Takizawa, Daichi Sato, Shoichi Hirasawa, Hiroaki Kobayashi

Springer-Verlag　2015
Sustained Simulation Performance 2014

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Springer-Verlag　2014
High Performance Computing on Vector Systems 2012

Hiroyuki Takizawa, Ryusuke Egawa, Daisuke Takahashi, Reiji Suda

Springer-Verlag　2012
High Performance Computing on Vector Systems, 2012

Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Springer-Verlag　2012
Software Automatic Tuning: From Concepts to State-of-the-Art Results

Katsuto Sato, Hiroyuki Takizawa, Kazuhiko Komatsu, Hiroaki Kobayashi

Springer-Verlag　2010
High Performance Computing on Vector Systems 2009

Hiroaki Kobayashi, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Akihiro Musa, Takashi Soga, Yoko Isobe

Springer-Verlag　2009
High Performance Computing on Vector Systems 2007

Hiroaki Kobayashi, Akihiro Musa, Yoshiei Sato, Hiroyuki Takizawa, Koki Okabe

Springer-Verlag　2008
High Performance Computing on Vector Systems 2008

Hiroaki Kobayashi, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Akihiro Musa, Takashi Soga, Yoichi Shimomura

Springer-Verlag　2008
New text for operating information processing devices

M. Yamamoto, H. Takizawa

2003/04
Text for operating information processing devices

I. Yamasaki, M. Hasegawa, H. Takizawa

2000/04

Presentations 136

GPUコンピューティングの現状と展望 Invited

滝沢寛之

放射光学会年会（JSR2026）　2026/01/09
異種複数のスパコンの連携による津波シミュレーションの緊急実行 Invited

滝沢寛之

NEC HPC Forum　2025/11/25
Urgent Computing of Tsunami Damage Estimation on Geographically Distributed Computing Systems Invited

Hiroyuki Takizawa

SC25 NEC Forum　2025/11/17
The Cyberscience Center not only for Cyberscience

Hiroyuki Takizawa

40th Workshop on Sustained Simulation Performance　2025/10/14
Research and user support activities at Tohoku University Cyberscience Center Invited

Hiroyuki Takizawa

39th Workshop on Sustained Simulation Performance　2025/05/27
Operational experience of the largest vector supercomputer, AOBA-S Invited

Hiroyuki Takizawa

NUG Society Meeting 36　2025/05/13
Advanced resource management for urgent job execution in Connected Supercomputing Invited

Hiroyuki Takizawa

Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT2025)　2025/03/21
ワークフローエンジンとの連携に基づく臨機応変なジョブスケジューリングの実現

滝沢寛之

第16回自動チューニング技術の現状と応用に関するシンポジウム（ATTA2024）　2024/12/26
スパコンAOBA-Sの性能評価と将来計画 Invited

滝沢寛之

太陽地球環境シミュレーション研究会　2024/12/24
New Strategies at Tohoku University Cyberscience Center

Hiroyuki Takizawa

38th Workshop on Sustained Simulation Performance　2024/12/12
ExpressHPC: towards "connected supercomputing" enabling on-demand job execution for disaster resilience.

Hiroyuki Takizawa, Tatsuyoshi Ohmura, Keichi Takahashi, Yoichi Shimomura, Ryusuke Egawa, Yoshihiko Sato, Junko Yoshino, Akihiro Musa, Shunichi Koshimura

4th Combined Workshop on Interactive and Urgent High-Performance Computing (WIUHPC)　2024/11/18
Realizing Connected Supercomputing with dynamic and adaptive resource management Invited

Hiroyuki Takizawa

SC24 Nagoya University Booth Presentation　2024/11/18
10年後の情報基盤センターは地球と人類にいかに貢献するか？ Invited

滝沢寛之

第50回ASE研究会　2024/11/08
Connected Supercomputing with on-demand job execution for disaster mitigation and more… Invited

Hiroyuki Takizawa

Reality in Science, Art, and Humanities – paradigms of its media conditions　2024/10/21
Operational experience of the latest-generation SX-Aurora TSUBASA system, AOBA-S Invited

Hiroyuki Takizawa

37th Workshop on Sustained Simulation Performance　2024/06/17
Introduction of AOBA-S: The world’s largest SX-Aurora TSUBASA system operating at Tohoku University Invited

Hiroyuki Takizawa

NUG Society Meeting 35　2024/06/14
ML-based Autotuning of Quantum Annealing Schedule Invited

Hiroyuki Takizawa, Michael Zielewski, Keichi Takahashi, Yoichi Shimomura

Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT2024)　2024/03/22
スパコンAOBAの運用開始と将来展望 Invited

滝沢寛之

Supercomputing JAPAN! 2024　2024/03/12
Automatic Parameter Tuning for Efficient Checkpointing International-presentation Invited

Hiroyuki Takizawa, Muhammad Alfian Amrizal, Kazuhiko Komatsu, Ryusuke Egawa

28th Workshop on Sustained Simulation Performance　2018/10/10
Initial Evaluation of Basic Performance and Functionality of Aurora Invited

TAKIZAWA Hiroyuki

SX-Aurora TSUBASA Forum　2018/07/27
Automatic Parameter Tuning of Application-Level Incremental Checkpointing International-presentation Invited

Hiroyuki Takizawa, Muhammad Alfian Amrizal, Kazuhiko Komatsu, Ryusuke Egawa

2018 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing　2018/03/27
Towards prediction of effective optimizations in performance engineering International-presentation

Hiroyuki Takizawa, Yuki Kawarabatake, Mulya Agung, Kazuhiko Komatsu, Ryusuke Egawa

27th Workshop on Sustained Simulation Performance　2018/03/22
Make full use of supercomputers! -- Importance and challenges for efficient use of supercomputers -- Invited

TAKIZAWA Hiroyuki

2018/03/16
User-Defined Code Transformation for Separation of Performance-Awareness from Application Codes International-presentation

Hiroyuki Takizawa

SIAM Conference on Parallel Processing for Scientific Computing (minisymposium)　2018/03/09
Auto-tuning of Hyperparameters of Machine Learning Models International-presentation

Zhen Wang, Ryusuke Egawa, Reiji Suda, Hiroyuki Takizawa

HPC Asia 2018　2018/01/29
Thermal-aware Dynamic Checkpoint Interval Tuning for High Performance Computing International-presentation

Pei Li, Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

HPC Asia 2018　2018/01/29
A User-defined Code Transformation Approach to Separation of Performance Concerns International-presentation

Hiroyuki Takizawa

First Workshop on Software Challenges to Exascale Computing　2017/12/17
大規模科学計算システムにおける利用者プログラムの特性分析

大泉健治, 山下毅, 穂苅寛光, 江川隆輔, 滝沢寛之, 小林広明

大学ICT推進協議会 2017年度年次大会 (AXIES2017)　2017/12/13
反応・相変化を伴う多分散系混相流シミュレーションコードの最適化

佐々木大輔, 加藤季広, 磯部洋子, 笠原弘貴, 渡部広吾輝, 志村啓, 奥野航平, 松尾亜紀子, 江川隆輔, 滝沢寛之, 小林広明

大学ICT推進協議会 2017年度年次大会 (AXIES2017)　2017/12/13
Expressing performance-awareness as user-defined code transformations International-presentation

Hiroyuki Takizawa, Reiji Suda, Daisuke Takahashi, Ryusuke Egawa, Fumihiko Ino

International Symposium on Post Petascale System Software　2017/12/11
An Evolutionary Approach to Construction of a Software Development Environment for Massively-Parallel Heterogeneous Systems International-presentation

Hiroyuki Takizawa, Reiji Suda, Daisuke Takahashi, Ryusuke Egawa

International Symposium on Post Petascale System Software　2017/12/11
Performance Engineering with User-defined Code Transformations International-presentation

Hiroyuki Takizawa

Joint Workshop on High-Performance Computing with NSCC-Wuxi and Tohoku University　2017/09/21
ExaFSA - Exascale Simulation of Fluid-Structure-Acoustics Interactions International-presentation

Florian Lindner, Miriam Mehl, Thorsten Reimann, Sabine Roller, Dörte C. Sternel, Hiroyuki Takizawa, Sander van Zuijlen

ISC High Performance 2017　2017/07/18
Xevolverプロジェクト -- 計算科学と計算機科学をつなぐ架け橋を目指して --

滝沢寛之

高度情報科学技術研究機構平成28年度高速化ワークショップ　2017/03/24
Performance Tuning with Machine Learning International-presentation

Hiroyuki Takizawa, Cui Hang, Shoichi Hirasawa

The 25th Workshop on Sustained Simulation Performance　2017/03/13
Combining Code Transformations and Autotuning International-presentation

Hiroyuki Takizawa

2017 Advanced Topics and Auto-Tuning in High-Performance Scientific Computing 2017　2017/03/11
User-Defined Directive Translation for Automatic Tuning International-presentation

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2017 Advanced Topics and Auto-Tuning in High-Performance Scientific Computing 2017　2017/03/11
User-Defined Directive Translation Using the Xevolver Framework International-presentation

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

SIAM Computational Science and Engineering　2017/03/02
進化的アプローチによる超並列複合システム向け開発環境の創出

滝沢寛之

第8回自動チューニング技術の現状と応用に関するシンポジウム(ATTA2016)　2016/12/26
Xevolverプロジェクトの概要

滝沢寛之

ポストペタワークショップ　2016/12/14
Autotuning meets Code Transformations International-presentation

Hiroyuki Takizawa

24th Workshop on Sustained Simulation Performance　2016/12/05
Making a Legacy Code Auto-Tunable without Messing It Up International-presentation

Hiroyuki Takizawa, Daichi Sato, Shoichi Hirasawa, Hiroaki Kobayashi

ACM/IEEE Supercomputing Conference (SC16)　2016/11/13
User-Defined Code Transformation for High Performance Portability International-presentation

Hiroyuki Takizawa, Shoichi Hirasawa, Hiroaki Kobayashi

SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP16)　2016/04/12
Performance Engineering of HPC Applications Based on Pattern Matching International-presentation

Hiroyuki TAKIZAWA, Takeshi YAMADA, Takuya TSUNOGAWA, Shoichi HIRASAWA, Hiroaki KOBAYASHI

23rd Workshop on Sustained Simulation Performance　2016/03/16
Data layout optimization using user-defined code transformations International-presentation

Hiroyuki Takizawa, Takeshi Yamada, Shoichi Hirasawa, Hiroaki Kobayashi

2016 Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing　2016/02/19
A Code Transformation Approach to Achieving High Performance Portability International-presentation

Hiroyuki TAKIZAWA, Daisuke TAKAHASHI, Reiji SUDA, Ryusuke EGAWA

SPPEXA Annual Plenary Meeting 2016　2016/01/25
進化的アプローチによる超並列複合システム向け開発環境の創出

滝沢寛之, 高橋大介, 須田礼仁, 江川隆輔

第7回自動チューニング技術の現状と応用に関するシンポジウム(ATTA2015)　2015/12/25
Xevtgen: automatic generation of code transformation rules based on before-and-after codes International-presentation

Hiroyuki Takizawa, Shoichi Hirasawa, Reiji Suda

22nd Workshop on Sustained Simulation Performance　2015/12/17
The Xevolver Project: Separation of Concerns for Supporting Legacy Application Migration

Hiroyuki Takizawa

ATRG Open Academic Session　2015/12/11
機械工学分野におけるシミュレーション科学の新展開

滝沢寛之

学際大規模情報基盤共同利用・共同研究拠点第7回シンポジウム　2015/07/09
Framework for Separation of Concerns Between Application Requirements and System Requirements International-presentation

Hiroyuki Takizawa, Shoichi Hirasawa, Hiroaki Kobayashi

SIAM Conference on Computational Science & Engineering 2015　2015/03/16
Auto-Tuning with User-Defined Code Transformations International-presentation

Hiroyuki Takizawa

2015 Conference on Advanced Topics and Auto-Tuning in High-Performance Scientific Computing　2015/02/26
What can we do to fight with system diversity? International-presentation

Hiroyuki Takizawa

21st Workshop on Sustained Simulation Performance　2015/02/18
進化的アプローチによる超並列複合システム向け開発環境の創出

滝沢寛之, 須田礼仁, 高橋大介, 江川隆輔

第6回自動チューニング技術の現状と応用に関するシンポジウム(ATTA2014)　2014/12/25
Xevolver: an extensible framework for user-defined code transformation International-presentation

Hiroyuki Takizawa

20th Workshop on Sustained Simulation Performance　2014/12/15
Xevolver Project International-presentation

Hiroyuki Takizawa, Daisuke Takahashi, Reiji Suda, Ryusuke Egawa

International Symposium on Post Petascale System Software (ISP2S2) 2014　2014/12/02
Xevolver Project International-presentation

Hiroyuki Takizawa, Daisuke Takahashi, Reiji Suda, Ryusuke Egawa

Asian Technology Information Program (ATIP) Workshop at SC14　2014/11/17
機械工学分野におけるシミュレーション科学の新展開

滝沢寛之

学際大規模情報基盤共同利用・共同研究拠点第6回シンポジウム　2014/07/11
Evolutionary Adaptation of HPC Applications to Revolutionary System Changes International-presentation

Hiroyuki Takizawa

International Supercomputing Conference (ISC) 2014　2014/06/22
Xevolver: an extensible programming framework for custom code transformation International-presentation

Hiroyuki Takizawa, Shoichi Hirasawa, Hiroaki Kobayashi

2014 Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing　2014/03/15
はやいぃスパコンは作れる！？

滝沢寛之

JACORN2013 Winter - 次世代 RHW 創造研究会　2013/12/26
進化的アプローチによる超並列複合システム向け開発環境の創出

滝沢寛之, 須田礼仁, 高橋大介, 江川隆輔

第5回自動チューニング技術の現状と応用に関するシンポジウム(ATTA2013)　2013/12/25
An XML-based Programming Framework for User-defined Code Transformations International-presentation

Hiroyuki Takizawa, Xiong Xiao, Shoichi Hirasawa, Hiroaki Kobayashi

4th AICS International Symposium　2013/12/02
Xevolver : an XML-based Programming Framework for Software Evolution International-presentation

Hiroyuki Takizawa, Shoichi Hirasawa, Hiroaki Kobayashi

ACM/IEEE Supercomputing Conference (SC13)　2013/11/17
XMLを用いたツール間連携に向けて

滝沢寛之

1st XcalableMP Workshop　2013/11/01
Xevolver: towards an extensible programming environment for software evolution International-presentation

Hiroyuki Takizawa

International Symposium on Embedded Multicore/Many-core Systems-on-Chip　2013/09/26
OpenACCにおける性能チューニングとその効果

滝沢寛之, 平澤将一, 小松一彦, 小林広明

日本応用数理学会年会　2013/09/09
A Case Study of Performance Tuning with the POET Framework

肖熊, 平澤将一, 滝沢寛之, 小林広明

電気関係学会東北支部連合大会　2013/08/23
Code Refactoring for High Performance Computing Applications

Chunyan Wang, 平澤将一, 滝沢寛之, 小林広明

電気関係学会東北支部連合大会　2013/08/23
ブロックバイパス機構によるキャッシュのエネルギ効率化に関する研究

高井拓実, 佐藤雅之, 江川隆輔, 滝沢寛之, 小林広明

並列/協調/分散処理に関するサマーワークショップ(SWoPP)　2013/07/31
マルチプラットフォームにおける最適化手法の効果に関する一検討

小松一彦, 佐々木俊英, 江川隆輔, 滝沢寛之, 小林広明

並列/協調/分散処理に関するサマーワークショップ(SWoPP)　2013/07/31
Autotuning for Improving the Fault Tolerance of Large-scale Simulations International-presentation

Hiroyuki Takizawa, Alfian Amrizal, Shoichi Hirasawa, Hiroaki Kobayashi

Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing　2013/03/27
ソフトウェア進化のための自動性能追跡システム

平澤将一, 滝沢寛之, 小林広明

ハイパフォーマンスコンピューティングと計算科学シンポジウム(HPCS2013)　2013/01/15
プログラム自動生成技術に基づくGPUコンピューティングの性能評価

菅原誠, 佐藤功人, 小松一彦, 滝沢寛之, 小林広明

並列/協調/分散処理に関するサマーワークショップ(SWoPP)　2011/07/27
マイグレーションによる複合型計算システム向けジョブスケジューリング

小山賢太郎, 佐藤功人, 小松一彦, 村田善智, 滝沢寛之, 小林広明

先進的計算基盤システムシンポジウム(SACSIS2011)　2011/05/25
ルーフラインモデルに基づくベクトルプロセッサ向けプログラム最適化戦略

佐藤義永, 永岡龍一, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

ハイパフォーマンスコンピューティングと計算科学シンポジウム(HPCS2011)　2011/01/18
実アプリケーションを用いたチップマルチベクトルプロセッサの消費エネルギ評価

永岡龍一, 佐藤義永, 撫佐昭裕, 江川隆輔, 滝沢寛之, 小林広明

ハイパフォーマンスコンピューティングとアーキテクチャの評価に関する北海道ワークショップ(HOKKE-18)　2010/12/16
Cache Partitioning Strategies for 3-D Stacked Vector Processors International-presentation

Yusuke Funaya, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IEEE 3D System Integration Conference 2010　2010/11/16
A Performance Tuning Strategy under Combining Loop Transforms for a Vector Processor with an On-Chip Cache International-presentation

Yoshiei Sato, Ryuichi Nagaoka, Akihiro Musa, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

ACM/IEEE Supercomputing Conference (SC10)　2010/11/13
複合型計算システムにおける実行時自動チューニング

滝沢寛之

自動チューニング技術の現状と応用に関するシンポジウム　2010/11
A Runtime Task Reallocation Library for Heterogeneous Computational Environments International-presentation

Katsuto Sato, Kazuhiko Komatsu, Hiroyuki Takizawa, Hiroaki Kobayashi

7th International Conference on Fluid Dynamics　2010/11/01
A Load-Forwarding Mechanism for the Vector Architecture in Multimedia Applications International-presentation

Ye Gao, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Euromicro Conference on Digital System Design　2010/09/01
An Out-of-order Vector Processing Mechanism for Multimedia Applications

Ye Gao, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

並列/協調/分散処理に関するサマーワークショップ(SWoPP)　2010/08/03
Efficient Data Management for the Building Cube Method using Cartesian Meshes on the GPU Platform International-presentation

Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

International Supercomputing Conference (ISC10)　2010/05/30
Parallel Processing of the Building-Cube Method on the GPU Platform International-presentation

Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

22nd International Conference on Parallel Computational Fluid Dynamics　2010/05/17
Performance of SOR Methods on Vector Processor SX-9 International-presentation

Takashi Soga, Akihiro Musa, Koki Okabe, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

22nd International Conference on Parallel Computational Fluid Dynamics　2010/05/17
ハイブリッド型計算環境のためのプログラミングフレームワークSPRAT

小松一彦, 小山賢太郎, 佐藤功人, 滝沢寛之, 小林広明

先端的ネットワーク＆コンピューティングテクノロジワークショップ　2010/03
A High-level Programming Framework for Efficient Hybrid-architecture Computing International-presentation

Kazuhiko Komatsu, Kentaro Koyama, Katsuto Sato, Hiroyuki Takizawa, Hiroaki Kobayashi

14th SIAM Conference on Parallel Processing for Scientific Computing Minisymposium　2010/02/24
OpenCL によるGPUコンピューティングの性能評価

荒井勇亮, 佐藤功人, 滝沢寛之, 小林広明

情報処理学会HPC研究会　2010/02/22
GPUを手軽にちゃんと使える環境の実現に向けて

東京工業大学計算世界観GCOEセミナー　2009/12/09
A High-level GPU Programming Framework for Fluid Dynamics Simulation International-presentation

Katsuto Sato, Hiroyuki Takizawa, Hiroaki Kobayashi

6th International Conference on Fluid Dynamics　2009/11/04
新アーキテクチャへのアプローチ

自動チューニング技術の現状と応用に関するシンポジウム　2009/10/22
CUDAアプリケーション向けチェックポイント・リスタート機能の実装と評価

滝沢寛之, 佐藤功人, 小松一彦, 小林広明

情報処理学会HPC研究会　2009/10/09
実アプリケーションによるチップマルチベクトルプロセッサの性能評価

佐藤義永, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

次世代スーパコンピューティングシンポジウム　2009/10/07
三次元積層技術による次世代ベクトルキャッシュの設計と評価

船矢祐介, 永岡龍一, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

次世代スーパコンピューティングシンポジウム　2009/10/07
3D On-Chip Memory for the Vector Architecture International-presentation

Yusuke Funaya, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

IEEE 3D System Integration Conference 2009　2009/09/28
Cellによる高性能計算の可能性を探る

日本機械学会2009年度年次大会　2009/09/15
Working Sets based Thread Scheduling with Cache Partitioning International-presentation

Masayuki Sato, Isao Kotera, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

Parallel Architecture and Compilation Techniques (PACT)　2009/09/12
次世代プログラミング環境～多様なプロセッサを使いこなす～

FIT2009　2009/09/03
An Auction based Resource Allocation Considering Multifaceted Utilities in a Peer-to-Peer Environment

Chaianan Satayapiwat, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

FIT2009　2009/09/02
ボランティアコンピューティングの高効率化のためのクライアントレベルスケジューリング

村田善智, 遠藤聡明, 滝沢寛之, 小林広明

FIT2009　2009/09/02
プロセッサ自動選択機能を有するBLASの実現に向けた性能評価

小松一彦, 小山賢太郎, 佐藤功人, 滝沢寛之, 小林広明

FIT2009　2009/09/02
キャッシュメモリを有するベクトルプロセッサのためのプログラム最適化手法

佐藤義永, 永岡龍一, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

並列/協調/分散処理に関するサマーワークショップ(SWoPP)　2009/08/04
ワーキングセット評価に基づくスレッドスケジューリング

佐藤雅之, 小寺功, 江川隆輔, 滝沢寛之, 小林広明

並列/協調/分散処理に関するサマーワークショップ(SWoPP)　2009/08/04
メモリ積層型3次元ベクトルプロセッサの評価

船矢祐介, 江川隆輔, 滝沢寛之, 小林広明

先端的計算基盤システムシンポジウム(SACSIS 2009)　2009/06/28
CPUとGPUを協調利用するソフトウェア開発環境

佐藤功人, 滝沢寛之, 小林広明

筑波大学計算科学研究センターGPGPU講習会/研究会　2009/06/24
Hiding Programming Complexity for GPU Computing

Suda laboratory, GPGPU special seminar　2009/06/11
ストリーム処理記述言語のGPU向け自動最適化の検討

佐藤功人, 滝沢寛之, 小林広明

先端的計算基盤システムシンポジウム(SACSIS 2009)　2009/05/28
Early Evaluation of a Memory-Stacked Vector Processor International-presentation

Yusuke Funaya, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

COOL Chips XII　2009/04/15
GPU向け線形代数ライブラリの性能評価

小山賢太郎, 佐藤功人, 小松一彦, 滝沢寛之, 小林広明

計算工学講演会　2009/04/13

計算工学講演会論文集 Vol.14, no.1, pp.289-292, 2009
SX-9による大規模並列シミュレーション

曽我隆, 下村陽一, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明, 高橋俊, 中橋和博

シナジー研究会　2009/02/13
実アプリケーションによるSX-9の性能評価

曽我隆, 下村陽一, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

ハイパフォーマンスコンピューティングと計算科学シンポジウム(HPCS2009)　2009/01/12
Caching on a Chip Multi Vector Processor International-presentation

Akihiro Musa, Yoshiei Sato, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

SC08　2008/11/15
ベクトルプロセッサ用キャッシュメモリにおけるMSHR の性能評価

佐藤義永, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

次世代スーパーコンピューティング・シンポジウム2008　2008/09/16
ウェイアロケーション型共有キャッシュ機構のハードウェア設計に関する研究

第7 回情報科学技術フォーラム(FIT2008)　2008/09/02
GPU を効率的に利用するための言語拡張と自動最適化手法

佐藤功人, 滝沢寛之, 小林広明

並列/協調/分散処理に関するサマーワークショップ(SWoPP2008)　2008/08/05
GPU コンピューティングのためのストリーム処理記述言語

第36 回可視化情報シンポジウム　2008/07/22
SPRAT: 実行時自動チューニング機能を備えるストリーム処理記述用言語

滝沢寛之, 白取寛貴, 佐藤功人, 小林広明

情報処理学会先進的計算基盤システムシンポジウム(SACSIS2008)　2008/06/11
分散協調型スケジューラを用いた大規模計算環境上での負荷分散手法の紹介

村田善智, 滝沢寛之, 小林広明

第２回InTrigger Community Workshop　2008/06/04
Auction-based Resource Allocation for activating incentives in resource trading in Grid Computing

Chainan Satayapiwat, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

先端的ネットワーク＆コンピューティングテクノロジワークショップ　2008/03/13
Preliminary evaluation of a result checking mechanism for reliable volunteer computing

Ling Xu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

先端的ネットワーク＆コンピューティングテクノロジワークショップ　2008/03/13
A Fast Ray Frustum-Triangle Intersection Algorithm with Precomputation and Early Termination

Kazuhiko Komatsu, Yoshiyuki Kaeriyama, Kenichi Suzuki, Hiroyuki Takizawa, Hiroaki Kobayashi

2008 年ハイパフォーマンスコンピューティングと計算科学シンポジウム (HPCS2008)　2008/01/17
ベクトルプロセッサ用キャッシュメモリの性能評価

佐藤義永, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

2008 年ハイパフォーマンスコンピューティングと計算科学シンポジウム (HPCS2008)　2008/01/17
Early Evaluation of On-Chip Vector Caching for the NEC SX Vector Architecture International-presentation

Akihiro Musa, Yoshiei Sato, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

SC07　2007/11/14
Preliminary Evaluation for Runtime Auto-tuning of GPGPU Applications International-presentation

Hiroyuki Takizawa, Hiroki Shiratori, Hiroaki Kobayashi

The Second International Workshop on Automatic Performance Tuning　2007/09/20
フォトンマップ分割に基づく並列画像生成アルゴリズム

田村壮秀, 滝沢寛之, 小林広明

第6回情報科学技術フォーラム　2007/09/05
実行時性能予測に基づくCPUとGPUへの動的タスク割当の検討

白取寛貴, 滝沢寛之, 小林広明

並列/分散/協調処理に関するサマー・ワークショップ　2007/08/01
ウェイアロケーション型共有キャッシュ機構の性能評価

小寺功, 滝沢寛之, 小林広明

並列/分散/協調処理に関するサマー・ワークショップ　2007/08/01
遊休計算資源を用いたパラメータスイープ型並列計算におけるタスクスケジューラの性能評価

村田善智, 小田川雅人, 滝沢寛之, 小林広明

先端的ネットワーク&コンピューティングテクノロジワークショップ　2007/03
PS3を用いた分散コンピューティング環境の開発と評価

小田川雅人, 吉田向志, 村田善智, 滝沢寛之, 小林広明

先端的ネットワーク&コンピューティングテクノロジワークショップ　2007/03
ゲームユーザーのユビキタスコンピューティングプラットフォームへの参加を促すインセンティブモデルの検討

中田武男, 大庭信之, 滝沢寛之, 小林広明

先端的ネットワーク&コンピューティングテクノロジワークショップ　2007/03
描画用ハードウェアの活用によるふく射伝熱の対話的シミュレーションと可視化

滝沢寛之, 山田昇, 酒井清吾, 小林広明

第一回日本ヒートアイランド学会全国大会　2006/07/27
Performance Evaluation of SX-7 Using Real Simulation Codes

Hiroyuki Takizawa, Akihiro Musa, Takashi Soga, Yoshiaki Matsumura, Manabu Ito, Hiroaki Kobayashi

ハイパフォーマンスコンピューティングと計算科学シンポジウム(HPCS2006)　2006/01
A Distributed and Cooperative Load Balancing Mechanism for Large-scale P2P Systems

Yoshitomo Murata, Tsutomu Inaba, Hiroyuki Takizawa, Hiroaki Kobayashi

先端的ネットワーク&コンピューティングテクノロジワークショップ　2005/10
P2Pコンピューティングのための分散協調スケジューリング機構

村田善智, 稲葉勉, 滝沢寛之, 小林広明

先端的ネットワーク＆コンピューティングテクノロジワークショップ　2005/01
A P2P Semantic Information Searching Mechanism for Ubiquitous Grid Computing Systems

Tsutomu Inaba, Takuro Ohkawa, Yoshitomo Murata, Hiroyuki Takizawa, Hiroaki Kobayashi

先端的ネットワーク&コンピューティングテクノロジワークショップ　2005/01

Research Projects 38

Programming Environments for High-performance Computing Competitive

System: JST Basic Research Programs (Core Research for Evolutional Science and Technology :CREST)

2011/10 - Present
High-performance low-power processor Competitive

System: Grant-in-Aid for Scientific Research

2003/03 - Present
ワークフローエンジンとの連携に基づく臨機応変なジョブスケジューリングの実現

滝沢寛之, 片桐孝洋, 佐野健太郎

Offer Organization: 日本学術振興会

System: 科学研究費助成事業

Category: 基盤研究(B)

Institution: 東北大学

2024/04/01 - 2027/03/31
宇宙初期における位相欠陥の一般相対論的シミュレーション

北嶋直弥, 神田行宏, 藤林翔, 滝沢寛之

Offer Organization: 学際大規模情報基盤共同利用・共同研究拠点

System: 2025年度学際大規模情報基盤共同利用・共同研究拠点(JHPCN)公募型共同研究課題

Institution: 東北大学

2025/04 - 2026/03
Digital twin of a supercomputer for operation monitoring and automation

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Challenging Research (Exploratory)

Institution: Tohoku University

2022/06/30 - 2025/03/31
線状降水帯の気象場変化に対する応答の解明: WRFアンサンブル計算を用いて

平賀優介, 滝沢寛之

Offer Organization: 学際大規模情報基盤共同利用・共同研究拠点(JHPCN)

System: 公募型共同研究課題

2024/04 - 2025/03
Development of system reliability improvement technology based on medium- to long-term failure prediction

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research (B)

Institution: Tokyo Denki University

2021/04/01 - 2024/03/31
Creation of Scalable Computers and their System Software for Post-Moore Era

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research (A)

Institution: Institute of Physical and Chemical Research

2020/04/01 - 2024/03/31
Expanding Industrial Use of Innovative Technology for Transportation Equipment Design Using Microdevices Through Large-Scale Simulation

Offer Organization: Tohoku University Cyberscience Center

System: JHPCN:Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures

Institution: Tohoku University

2017 - 2024
Research, Development, and Application of Real Particle Simulations for Plasma Interdisciplinary Science

Hiroaki Ohtani, Shunsuke Usami, Hiroki Hasegawa, Toseo Moritaka, Masanori Nunami, Mieko Toida, Hideaki Miura, Seiji Ishiguro, Ritoku Horiuchi, Nobuaki Ohno, Shintaro Kawahara, Hideyuki Usui, Yohei Miyake, Mitsue Den, Tomoya Ogawa, Keiichiro Fukazawa, Takahiro Katagiri, Hiroyuki Takizawa

Offer Organization: Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures (JHPCN)

System: Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures: Joint Research Projects (General Joint Research Projects)

2023/04 -
日本全土の洪水氾濫被害と適応策の検討

峠嘉哉, 滝沢寛之, 風間聡, 山本道, 柳原駿太, 池本敦哉, 岡本彩果

Offer Organization: 学際大規模情報基盤共同利用・共同研究拠点

System: 公募型共同研究

Category: 一般共同研究課題

Institution: 東北大学

2022/04 - 2023/03
日本全土の洪水氾濫被害の将来展望

風間聡, 滝沢寛之, 峠嘉哉, 柳原駿太

Offer Organization: 学際大規模情報基盤共同利用・共同研究拠点

System: 公募型共同研究

Category: 一般共同研究課題

Institution: 東北大学

2021/04 - 2022/03
Creation of non-Neumann FPGA Overlay Architecture for Innovating HPC

Sano Kentaro

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research (B)

2017/04/01 - 2020/03/31

We have developed fundamental technologies for a non-Neumann overlay architecture that exploits FPGAs, which are circuit-reconfigurable semiconductor devices, in order to achieve next-generation HPC systems in place of Neumann architectures, whose performance improvement is slowing down. Using a prototype FPGA cluster, we constructed its hardware and software framework and developed a high-level synthesis compiler that implements computing problems as data-flow circuits on FPGAs. We showed that a pipelining method can increase the performance of several computing problems as the number of FPGAs increases. This demonstrates that relatively low-power FPGAs can achieve high-performance and scalable computing.
Supporting performance-aware programming with machine learning techniques

Hiroyuki Takizawa, Kobayashi Hiroaki, Suda Reiji, Okatani Takayuki, Egawa Ryusuke, Ohshima Satoshi

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research (B)

Institution: Tohoku University

2016/04/01 - 2019/03/31

This work has demonstrated case studies of effectively using machine learning techniques to support High-Performance Computing (HPC) programming. Various problems in code optimization can be solved by converting them into problems that have already been shown to be solvable by machine learning. This work also clarified the importance of analyzing the target problems before applying machine learning, because a sufficient amount of training data is unlikely to be available for code optimization problems. Like HPC programming, machine learning also needs the knowledge and experience of human experts. In machine learning, however, the problem is already parameterized, and hence can be solved if sufficiently high performance is available.
Research on Software Autotuning Mechanism that evolves to unknown computing environments

SUDA Reiji, YASUGI Masahiro, KATAGIRI Takahiro

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Challenging Exploratory Research

Institution: The University of Tokyo

2015/04/01 - 2018/03/31

Autotuning is a technology that aims to attain good execution performance on various computational environments by building variabilities into software and letting the software itself control them. In this research, we aimed to develop a methodology for infusing variabilities and control mechanisms that were unintended by, or even unknown to, existing codes, so that autotuning remains possible even when novel computational environments and novel variabilities become known. We have shown that, by using Xevolver, which was developed by our team members, we can infuse variabilities and autotuning mechanisms that are unknown to the original code. However, it became clear that the original code must be fully analyzed before such infusions are applied.
Design Space Exploration of Future Microprocessors using the post CMOS devices

EGAWA Ryusuke, Kobayashi Hiroaki, Takizawa Hiroyuki, Tada Jubee, Sato Masayuki, Uno Wataru, Toyoshima Takuya, Sakai Zentaro, Ogasawara Daisuke

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Challenging Exploratory Research

Institution: Tohoku University

2015/04/01 - 2018/03/31

In this research, we worked on circuit and memory subsystem designs with the goal of realizing a highly energy-efficient microprocessor that uses novel device technologies of the post-Moore era, which are expected to become practical around 2025. For the circuit design, we worked on a design method for wave-pipelined circuits using CNFETs. For the memory subsystem, we focused on die-stacking and STT-RAM technologies. We examined a cache-bypass mechanism, an energy-efficient data allocation method for multi-bank memory, and a power-aware control mechanism for STT-RAM last-level caches.
A Green Microarchitecture in 5.5D-Design Era

EGAWA RYUSUKE, Kobayashi Hiroaki, Takizawa Hiroyuki, Sato Masayuki, Uno Wataru, Nishimura Shin, Hosokawa Mikio, Toyoshima Takuya

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research (B)

Institution: Tohoku University

2014/04/01 - 2017/03/31

To clarify the design space of future microprocessors after the end of Moore's law, this research project focuses on vertical integration technologies such as 2.5D and 3D integration using through-silicon vias (TSVs). Since TSVs have high potential for shortening latency and reducing power consumption in microprocessors and computing systems, these technologies are expected to overcome the limits of technology scaling. In this research, we explore the design space of future microprocessors by aggressively using TSVs at various stacking granularities. The evaluation results show that appropriate use of TSVs, taking into account the trade-off among performance, power, and cost, can drastically improve the energy efficiency of microprocessors and computer systems.
Checkpoint restart technologies for hierarchical storages

Hiroyuki Takizawa, Uno Atsuya, Kobayashi Hiroaki, Egawa Ryusuke, Sato Yukinori

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Challenging Exploratory Research

Institution: Tohoku University

2014/04/01 - 2016/03/31

Assuming that the state of an application is periodically saved during its execution, we have considered an automatic tuning method for the frequency of saving the state to a hierarchical storage system, and have also discussed a way to reduce the time needed to write the state to storage. A promising approach to this reduction is to speculatively write data that will be written in the future with high probability. One technical issue is therefore how to predict such data. The prediction requires analyzing the memory access patterns of the target application, so we have developed a performance analysis tool for this purpose. The validity and effectiveness of the proposed methods are evaluated through job scheduling simulation of a large-scale computing system.
A 3D Processor Architecture Co-Designed with Dependable Processing

Kobayashi Hiroaki, TAKIZAWA HIROYUKI, EGAWA RYUSUKE

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Challenging Exploratory Research

Institution: Tohoku University

2014/04/01 - 2016/03/31

The objective of this study is to establish a novel processor architecture that realizes both high performance and high dependability in the execution of a wide variety of applications by using 3D die-stacking technology toward the post-Moore era. In particular, we have developed a 3D die-stacked memory subsystem architecture integrated with processor cores, together with its data management mechanism, for a highly power-efficient and high-throughput memory hierarchy. In addition, we have developed an online checkpoint/restart mechanism that uses 3D die-stacked on-chip memory to increase the dependability of the processor. The proposed architecture has been evaluated quantitatively using a wide variety of applications, and its effectiveness and limitations have been clarified and discussed.
Infrastructures for accelerating the synergy effect of software-hardware co-design

Hiroyuki Takizawa, Kobayashi Hiroaki, Aoki Takafumi, Sano Kentaro, Egawa Ryusuke, Tada Jube, Ito Koichi

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research (B)

Institution: Tohoku University

2013/04/01 - 2016/03/31

Assuming OpenCL as a standard environment for accelerator programming, we have pointed out some features that are missing for supporting a wider variety of accelerator architectures, and have proposed OpenCL extensions. Although OpenCL has gradually come to be used for hardware description, OpenCL C is not necessarily appropriate for describing OpenCL kernels. Hence, we have designed and implemented high-productivity languages for typical computations in the fields of image processing and high-performance computing. In addition, we have proposed an automatic tuning method for performance parameters, which need to be adjusted for individual accelerators. The proposed method has been implemented to evaluate its performance impact.
A Universal Memory Architecture Based on Device-Architecture Co-Design

Kobayashi Hiroaki, TAKIZAWA HIROYUKI, EGAWA RYUSUKE

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research (B)

Institution: Tohoku University

2013/04/01 - 2016/03/31

The objective of this study is to establish a smart memory subsystem architecture that can take into account the memory access behavior of applications and effectively manage data in the memory hierarchy in terms of performance and power efficiency. In particular, we have developed 1) a low-power/high-bandwidth cache architecture, 2) a cache management policy that evaluates the memory request behavior of an application online to reduce its working set in the memory hierarchy, 3) a cache partitioning mechanism that protects performance-sensitive shared data on chip multicore processors, and 4) a memory address mapping mechanism with performance/power optimization based on online estimation of memory access behavior.
Application-Aware Highly Hierarchical Memory Architecture

KOBAYASHI Hiroaki, TAKIZAWA Hiroyuki, EGAWA Ryusuke

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Challenging Exploratory Research

Institution: Tohoku University

2012/04/01 - 2014/03/31

The objective of this study is to establish a novel on-chip memory architecture that provides the necessary memory resources to running applications, taking into account their behavior and their requirements on the memory subsystem of a multi-core processor. In this study, we have developed a cache-resource management mechanism that realizes energy-efficient, high-performance execution of multi-threaded applications on a multi-core processor. In cooperation with the hardware functions we developed for cache resizing and partitioning, which reduce cache conflicts and maximize the efficiency of cache utilization, this mechanism can extract the potential of multi-core processors with low power consumption.
Study of Next-Generation CFD toward Petaflops Computers

NAKAHASHI Kazuhiro, YAMAMOTO Satoru, OBAYASHI Shigeru, KOBAYASHI Hiroaki, YAMAMOTO Kazuomi, SASAKI Daisuke, JEONG Shinkyu, TAKIZAWA Hiroyuki, EGAWA Ryusuke, KUROTAKI Takuji, ENOMOTO Shunji, IMAMURA Taro, TAKAHASHI Shun

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research (S)

2009/05/11 - 2014/03/31

This study aimed to solve the problems of current CFD in the aerodynamic design of aircraft, such as the dependence of computational results on the physical model and the increasing workload of treating complex geometries. The Building-Cube Method was proposed with further improvements in computer performance in mind, and various algorithmic studies for practical use were conducted. One of the achievements was demonstrated by a world-leading large-scale flow computation around a car using the K computer. It is significant that the proposed CFD approach can treat extremely complicated and incomplete CAD data directly in the simulation. This can be a game-changing technology for the aerodynamic design process of aircraft and automobiles.
Study on a framework for auto generation and optimization of HPC accelerator architectures

SANO Kentaro, TAKIZAWA Hiroyuki

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Challenging Exploratory Research

Institution: Tohoku University

2011 - 2013

We have focused on the algorithm domain of stencil computation and cellular automata computation, which are representative high-performance computations, and studied a framework that automatically generates acceleration hardware for them for reconfigurable computing with FPGAs. In this project, we have developed a stencil compiler for an FPGA-based systolic array and a high-level synthesis compiler for FPGA-based stream-computing accelerators. These are significant and fundamental technologies for highly productive reconfigurable high-performance computing with FPGAs.
Technologies for realizing highly-efficient and highly-dependable heterogeneous computing systems in the post Petascale era.

TAKIZAWA Hiroyuki

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Young Scientists (B)

Institution: Tohoku University

2011 - 2012

A virtualization technique has been proposed to hide the heterogeneous configuration of different processors, by automatic task allocation considering their strengths and weaknesses. OpenCL has also been applied to programming of large-scale systems of various computing nodes. For high dependability, a transparent checkpoint restart mechanism for OpenCL applications has been developed. This work also investigated the practicality and limitations of OpenACC.
Innovative 3D Design for the New Generation Vector Microarchitecture

KOBAYASHI Hiroaki, TAKIZAWA Hiroyuki, EGAWA Ryusuke

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research (B)

Institution: Tohoku University

2010 - 2012

This study discusses a new design methodology for the microarchitecture of next-generation low-power, high-performance vector processors using 3D die-stacking technology. We have also proposed a strategy for mixing conventional 2D design with TSV (through-silicon via)-based 3D design that realizes a good trade-off between them at all levels of on-chip unit design. Through performance evaluation of a prototype 3D vector processor, the effectiveness of 3D design in terms of power consumption and performance has been clarified.
High-performance computing using graphics hardware Competitive

System: Grant-in-Aid for Scientific Research

2003/04 - 2011/09
Development of Auto-tuning Specification Language Towards Manycore and Massively Parallel Processing Era

KATAGIRI Takahiro, IMAMURA Toshiyuki, SUDA Reiji, KURODA Hisayasu, ITOH Shoji, IWASHITA Takeshi, TAKIZAWA Hiroyuki

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research (B)

Institution: The University of Tokyo

2009 - 2011

In this research, we carried out the following developments to establish an auto-tuning (AT) facility for high-performance execution in various computing environments: (1) functional extensions to the AT language ABCLibScript for multicore and massively parallel environments; (2) evaluation of the AT facility on multicore CPUs and GPUs; (3) evaluation of the effectiveness of the ABCLibScript AT facility by applying it to several application programs; and (4) release of the preprocessor code for the extended ABCLibScript as free software via the Internet.
A High-Performance Computing Framework to Exploit Various Processors

TAKIZAWA Hiroyuki

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Young Scientists (B)

Institution: Tohoku University

2009 - 2010

The purpose of this work is to achieve a high-performance computing framework that can exploit the computing power of each processor in a heterogeneous computing system while keeping source code portable. To make good use of various computing resources, this work explores an auto-tuning mechanism for a high-level language, numerical libraries that can be used seamlessly from the high-level language, and a job scheduling method.
Acceleration of large-scale data clustering Competitive

1999/10 - 2009/03
Study on Hardware-Software Collaborative Scheduling for Highly Efficient Multithreading

KOBAYASHI Hiroaki, NAKAMURA Tadao, SUZUKI Kenichi, TAKIZAWA Hiroyuki, EGAWA Ryusuke, SATO Yukinori, KOTERA Isao, FUNAYA Yusuke, SATO Masayuki

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research (B)

Institution: Tohoku University

2006 - 2009
Large-scale distributed computing with idle computers Competitive

System: Grant-in-Aid for Scientific Research

2003/03 - 2008/03
A study of a unified software development scheme in the heterogeneous multicore era

TAKIZAWA Hiroyuki

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Young Scientists (B)

Institution: Tohoku University

2007 - 2008
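Ultra-Large-Scale Data Mining through Safe and Secure Volunteer Computing

KOBAYASHI Hiroaki, TAKIZAWA Hiroyuki

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research on Priority Areas

Institution: Tohoku University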

2007 - 2008

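This research aims to establish the fundamental technologies for realizing large-scale data mining through volunteer computing that exploits the functions and performance of home game consoles. In fiscal year 2008, we built a distributed data mining system that analyzes physical phenomena in the vicinity of a rocket exhaust nozzle, and conducted a demonstration experiment of large-scale data mining in a volunteer computing environment consisting of PLAYSTATION 3 consoles and InTrigger. The results showed that, when conventional centralized task scheduling is used for dynamic load balancing, the load balancing becomes inefficient as the number of computing resources increases, and the expected performance cannot be achieved in a large-scale volunteer computing environment. In contrast, the distributed cooperative scheduling mechanism proposed in this research was shown to perform dynamic load balancing efficiently even as the number of computing resources increases. This evaluation demonstrated that the proposed mechanism is effective for dynamic load balancing in large-scale volunteer computing environments. We also proposed a worker-side scheduling method so that volunteers who participate in multiple projects do not waste idle computing power. As a means of improving the reliability of volunteer computing, we also proposed a method for efficiently verifying the validity of computation results. Simulations showed that malicious workers can be detected by quantifying the reliability of each worker and updating it based on the validity evaluation of its computation results. Furthermore, focusing on the high rendering performance of home game consoles, we examined ways to use that rendering performance for data mining and studied a programming framework that makes such programming easy.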
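Ultra-Large-Scale Data Mining through Safe and Secure Volunteer Computing

KOBAYASHI Hiroaki, TAKIZAWA Hiroyuki

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research on Priority Areas

Institution: Tohoku University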

2006 - 2006

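This fiscal year, we focused on data clustering (DC) and neural networks (NN), which demand particularly high computational performance among representative data mining methods, and examined implementation methods for executing them efficiently on home game consoles. Specifically, we studied how to use the Cell Broadband Engine (CBE), a high-performance processor installed in home game consoles, and the graphics processing unit (GPU) effectively for data mining, and carried out implementations and quantitative performance evaluations. As research on large-scale P2P computing, we studied a distributed computing-resource management mechanism for efficiently finding resources that satisfy user requests among the vast number of idle computing resources scattered across the network. We found that user requests exhibit temporal and spatial locality, similar to that seen in the memory access behavior of computers, and that exploiting this locality can dramatically improve search efficiency. This fiscal year, we considered resource search in heterogeneous environments in particular, and examined a mechanism that automatically adjusts the number of P2P connections according to frequency of use. As a mechanism for coordinating a vast number of computers, we also advanced our research on a fully distributed dynamic load balancing mechanism and designed its basic control scheme. In preparation for realizing a safe and secure distributed data mining system based on tamper-resistant computation on a volunteer computing infrastructure, we set up a development environment this fiscal year. We also collected related materials and held discussions with the people concerned.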
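A Clustering Method for Multidimensional Time-Series Data Mining and Its Parallelization

TAKIZAWA Hiroyuki

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Young Scientists (B)

Institution: Tohoku University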

2003 - 2004

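Data clustering requires many distance calculations between high-dimensional vectors to search for the nearest cluster (nearest neighbor search), and the computational load becomes a major issue when it is applied to large-scale problems. In fiscal year 2003, focusing on the rapid progress of graphics hardware (GPUs) for personal computers (PCs), this research achieved fast nearest neighbor search by using a commodity GPU as a parallel processor (GPGPU). In fiscal year 2004, we applied these results to develop a method that performs data clustering at high speed through cooperation between the GPU and the CPU. This method can effectively exploit the two kinds of parallelism inherent in the distance calculations of nearest neighbor search, and the work was very highly regarded academically, receiving a best paper award at an international conference. We also proposed a method for effectively parallelizing competitive learning, which is applicable to data clustering, on a PC cluster, and the results were published in an international journal. We continued to examine visualization, an important element of data mining, and demonstrated through a connection experiment between Hokkaido University and Tohoku University over Super SINET that a visualization server can be used interactively from a remote site. Clustering and subsequent visualization such as volume rendering were performed on computation servers at physically remote locations, demonstrating that such servers can be used for data mining. These results are scheduled for publication in an academic journal. The method of Chinrungueng et al. minimizes the average distortion by evaluating the optimality of clusters using the partial distortion entropy. However, it requires many iterations before appropriate clusters are formed, and may not be able to follow temporal changes in time-series data quickly. In this research, we newly proposed a method that appropriately relocates clusters based on the partial distortion entropy, and confirmed that applying it to adaptive vector quantization of video achieves both fast tracking and good distortion minimization.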
An Intelligent Memory Architecture for 3D Graphics

KOBAYASHI Hiroaki, NAKAMURA Tadao, SUZUKI Ken-ichi, TAKIZAWA Hiroyuki, SANO Kentaro

Offer Organization: Japan Society for the Promotion of Science

System: Grants-in-Aid for Scientific Research

Category: Grant-in-Aid for Scientific Research (B)

Institution: Tohoku University

2002 - 2004

We have the following achievements. (1) High-performance graphics algorithm and its hardware: We analyzed parallelism and locality of reference in a graphics algorithm based on the global illumination model, and designed a novel rendering pipeline architecture for this algorithm. In addition, we designed and developed prototype hardware based on the architecture. Through performance evaluation of the hardware, we showed its effectiveness for realizing interactive ray tracing. Moreover, we designed a new high-performance algorithm for generating walkthrough animations. (2) Power-efficient memory mechanism: For the design of an intelligent memory architecture for mobile devices, we designed a low-power mechanism for on-chip memory systems. In this mechanism, memory modules are activated and deactivated based on their activity during program execution. We clarified the relationship between activated memory modules and sustained performance, and showed the effectiveness of power-aware computing for on-chip cache memory. (3) Data compression algorithms for graphics hardware: We applied vector quantization to volume data sets to achieve efficient data compression, and designed a visualization algorithm that can directly visualize the compressed volume data. We also designed a novel data compression algorithm using data clustering for graphics hardware.
Efficient active learning of neural networks Competitive

1995/04 - 1999/10

Social Activities 2

GPU Computing Seminar @ Tohoku University

2009/12/17 -

Gave a talk at a company-sponsored seminar on the latest trends and future prospects in related research fields
Special Lecture on Advanced Course Research, National Institute of Technology, Sendai College

2009/12/16 -

Special lecture at the Hirose Campus of the National Institute of Technology, Sendai College

Media Coverage 1

Young HPC Researchers Take Global Stage

HPCwire

2014/05/15

Type: Other

Other 4

ExaFSA

Developing numerical simulations of Fluid-Structure-Acoustic Interactions
Development of System Software Technologies for Post-Peta Scale High Performance Computing

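Establishing a methodology for smoothly migrating the valuable software assets developed so far to the massively parallel heterogeneous systems of the post-petascale generation is an important challenge that must be met within the next few years, and a development environment that supports this work is strongly desired. Taking into account compatibility with existing software assets and the continuity of software development, this research aims to realize a development environment for massively parallel heterogeneous systems through an evolutionary approach that creates a new environment based on existing ones. Specifically, we develop new software development technologies for massively parallel heterogeneous systems at each level (language processors, libraries, runtime environments, support tools, and applications) and realize a development environment based on them.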
Construction of a Rapid Prototyping Environment for Interactive Physics Simulation

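The purpose of this research is to realize a development environment in which the multiple processors found in today's common game consoles can each easily be used for the tasks they suit best, in order to support the development of interactive physics simulations and the photorealistic image generation applications that work with them. In recent years, the rendering performance of game consoles has improved dramatically, and it is becoming possible to interactively render images that could be mistaken for the real thing. However, the more photorealistic the game screen is, the more conspicuous any unnatural motion that violates the laws of physics becomes, which stands in the way of generating even higher-quality photorealistic images. Therefore, to give players a sense of virtual reality, the people and objects drawn on the game screen must move naturally from the standpoint of physical laws, and in games, where interactivity is required, interactive physics simulation and photorealistic image generation based on it will become increasingly important. For this reason, this research builds an environment for easily prototyping high-performance interactive physics simulations and experimenting with them by trial and error in the early stages of game development.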
Research and Development of a Safe, Secure, and Inexpensive Ubiquitous Computing Platform for Creating an ICT Eco-Society

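Aiming to establish an ecological model in the information and communications field, we research and develop a safe, secure, and inexpensive volunteer computing infrastructure for the ubiquitous era that makes use of the computing resources pervasive throughout society. In particular, we study how to make volunteer computing more efficient and more reliable, as well as incentive models that encourage participation, and establish the fundamental technologies for inexpensively providing a large-scale computing infrastructure that can be used even for highly confidential computation, at a scale that is difficult to achieve with conventional implementation technologies.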