Details of the Researcher

Hiroyuki Takizawa
Section
Cyberscience Center
Job title
Professor
Degree
  • Doctor of Information Sciences (Tohoku University)

  • Master of Information Sciences (Tohoku University)

e-Rad No.
70323996

Research History 7

  • 2024/04 - Present
    Tohoku University

  • 2019/04 - Present
    Tohoku University Cyberscience Center Vice Director

  • 2017/01 - Present
    Tohoku University Cyberscience Center Professor

  • 2009/01 - 2016/12
    Associate Professor, Graduate School of Information Sciences, Tohoku University

  • 2004/04 - 2008/12
    Assistant Professor, Graduate School of Information Sciences, Tohoku University

  • 2003/03 - 2004/03
    Research Associate, Information Synergy Center, Tohoku University

  • 1999/10 - 2003/02
    Research Associate, Integrated Information Processing Center, Niigata University

Committee Memberships 43

  • HPCI Consortium Director

    2024/07 - Present

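  • Information Processing Society of Japan (IPSJ) Special Interest Group on High Performance Computing (SIGHPC) Steering Committee Secretary (Vice Chair)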

    2021/04 - Present

  • HPCI Cooperative Service Committee Member

    2021/03 - Present

  • International Workshop on Automatic Performance Tuning Program Committee Member

    2009 - Present

  • COOL Chips Conference Program Committee Member

    2007 - Present

  • HPC Asia 2026 General Chair

    2025 - 2026

  • HPCI Cooperative Service Organizing and Working Group Working Group Chair

    2019/04 - 2021/03

  • IPSJ Special Interest Group on High Performance Computing (SIGHPC) Steering Committee Member

    2015/04 - 2019/03

  • HPC Asia 2019 Program Committee Track co-chair

    2018 - 2019

  • ACM/IEEE Supercomputing Conference, Tutorials Committee Committee member

    2017 - 2019

  • Legacy HPC Application Migration (LHAM) Organizing Committee Member

    2013 - 2018

  • Auto-Tuning for Multicore and GPU (ATMG) Program Committee Member

    2012 - 2018

  • IPSJ Special Interest Group on System Architecture (SIGARC) Steering Committee Member

    2013/04 - 2017/03

  • IPSJ Symposium on High Performance Computing and Computational Science (HPCS) Program Committee Member

    2008/10 - 2017/03

  • IPSJ Tohoku Branch Steering Committee Member

    2014/04 - 2016/03

  • IPSJ Annual Meeting on Advanced Computing System and Infrastructure (ACSI) Program Committee Member

    2014/04 - 2015/03

  • IPSJ Transactions on Advanced Computing Systems (ACS) Editorial Committee Member

    2011/04 - 2015/03

  • IPSJ Tohoku Branch General Affairs Secretary

    2012/04 - 2014/03

  • IPSJ Symposium on Advanced Computing Systems and Infrastructures (SACSIS) Program Committee Member

    2012/04 - 2014/03

  • IPSJ Tohoku Branch Public Relations Secretary

    2010/04 - 2012/03

  • Society of Scientific Systems Accelerator Technology Working Group Member

    2009/09 - 2012/03

  • IPSJ Special Interest Group on High Performance Computing (SIGHPC) Steering Committee Member

    2007/04 - 2011/03

  • IEICE Technical Committee on Computer Systems Member

    2005/04 - 2011/03

  • International Workshop on Automatic Performance Tuning (iWAPT) Program chair

    2025 -

  • HPCI Cooperative Service Organizing and Working Group Member

    2024/10 -

  • ICPP2021 Program Committee Member

    2021 -

  • ACM/IEEE Supercomputing Conference 2020 (SC20) Technical Program Committee Member

    2020/11 -

  • International Workshop on Large-scale HPC Application Modernization (LHAM) Program Committee Chair

    2018 -

  • HPC Asia 2018 Program Committee Member

    2018 -

  • HPC Asia 2018 Poster Chair

    2018 -

  • IPSJ Symposium on High Performance Computing and Computational Science (HPCS) Program Committee Chair

    2016/06 -

  • International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies (HEART) Program Committee Member

    2015/04 -

  • International Workshop on Software Engineering for Parallel Systems (SEPS) Program Committee Member

    2015 -

  • International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies (HEART) Program Committee Member

    2015 -

  • 2014 IEEE International Parallel & Distributed Processing Symposium (IPDPS) Program Committee Member

    2014/06 -

  • International Workshop on Hardware-Software Co-Design for High Performance Computing (Co-HPC) Program Committee Member

    2014 -

  • ACM/IEEE Supercomputing Conference 2013 (SC13) Technical Program Committee Member

    2013/11 -

  • Auto-Tuning for Multicore and GPU (ATMG) Organizing Committee Chair

    2013/09 -

  • Legacy HPC Application Migration (LHAM) Organizing Committee Chair

    2013 -

  • International Workshop on Automatic Performance Tuning Organizing Committee Chair

    2012 -

  • International Workshop on Peer-to-Peer Networking (P2PNet'10) Program Committee Member

    2010/12 -

Professional Memberships 4

  • The Institute of Electronics, Information, and Communication Engineers

  • Association for Computing Machinery (ACM)

  • The Institute of Electrical and Electronics Engineers (IEEE)

  • Information Processing Society of Japan

Research Interests 3

  • parallel and distributed processing

  • computer architecture

  • High-performance computing

Research Areas 5

  • Informatics / High-performance computing

  • Informatics / Intelligent informatics

  • Informatics / Information networks

  • Informatics / Computer systems

  • Informatics / Software

Awards 18

  1. Best Undergraduate Student Award

    2024/08

  2. IEEE Computer Society Japan Chapter xSIG Young Researcher Award

    2024/08 IEEE Computer Society Japan Chapter

  3. Best paper award at the 26th Workshop on Advances in Parallel and Distributed Computational Models

    2024/05 Combining lossy compression with multi-level caching for data staging over network

  4. Outstanding Effort Award

    2023/07 The 7th cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming

  5. Best Workshop Paper Award

    2020/11 International Symposium on Computing and Networking (CANDAR20) Improving the Accuracy in SpMV Implementation Selection with Machine Learning

  6. IEEE Computer Society Japan Chapter xSIG Young Researcher Award

    2020/07 The 4th cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming

  7. HPC IN ASIA POSTER AWARD

    2020/06 ISC High Performance Computing 2020 Challenges in Solving Scheduling Problems with the D-Wave Quantum Annealer

  8. Best Poster Award at COOL Chips 22

    2019/04 IEEE Symposium on Low-Power and High-Speed Chips An Energy Optimization Method for Hybrid In-Memory Checkpointing

  9. Best Paper Award

    2018/12 The Second International Workshop on Automation in Machine Learning and Big Data (AutoML 2018)

  10. Best Workshop Paper Award at CANDAR'18

    2018/11 International Symposium on Computing and Networking (CANDAR)

  11. Best Workshop Paper Award at CANDAR'15

    2015/12/10 International Symposium on Computing and Networking (CANDAR)

  12. Best Poster Award at COOL Chips XV

    2012/04 IEEE Symposium on Low-Power and High-Speed Chips

  13. Best Poster Award of HiPEAC '12

    2012/01

  14. The Poster Award

    2011/01 Next-generation supercomputing symposium

  15. Ishida Foundation Award

    2009/10/30 Ishida Memorial Foundation

  16. Noguchi Award

    2008/05/28 IPSJ Tohoku Branch

  17. Funai Information Technology Encouragement Prize

    2006/04/22 Funai Foundation for Information Technology

  18. ISPA'04 Best Paper Award

    2004/12/14 ISPA2004

Papers 305

  1. Performance evaluation of the LBM simulations in fluid dynamics on SX-Aurora TSUBASA vector engine Peer-reviewed

    Xiangcheng Sun, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa, Xian Wang

    Computer Physics Communications 307 109411-109411 2025/02

    Publisher: Elsevier BV

    DOI: 10.1016/j.cpc.2024.109411  

    ISSN: 0010-4655

  2. Clustering Based Job Runtime Prediction for Backfilling Using Classification Peer-reviewed

    Hang Cui, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

    Lecture Notes in Computer Science 40-59 2024/12/21

    Publisher: Springer Nature Switzerland

    DOI: 10.1007/978-3-031-74430-3_3  

    ISSN: 0302-9743

    eISSN: 1611-3349

  3. Maximizing Energy Budget Utilization Using Dynamic Power Cap Control Peer-reviewed

    Sho Ishii, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

    Lecture Notes in Computer Science 161-180 2024/12/21

    Publisher: Springer Nature Switzerland

    DOI: 10.1007/978-3-031-74430-3_9  

    ISSN: 0302-9743

    eISSN: 1611-3349

  4. A Node Selection Method for on-Demand Job Execution with Considering Deadline Constraints Peer-reviewed

    Daiki Nakai, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

    Lecture Notes in Computer Science 141-160 2024/12/21

    Publisher: Springer Nature Switzerland

    DOI: 10.1007/978-3-031-74430-3_8  

    ISSN: 0302-9743

    eISSN: 1611-3349

  5. Leveraging Hardware Performance Counters for Predicting Workload Interference in Vector Supercomputers Peer-reviewed

    Shubham, Keichi Takahashi, Hiroyuki Takizawa

    International Conference on Parallel and Distributed Computing: Applications and Technologies (PDCAT) 2024/12

    DOI: 10.48550/arXiv.2410.18126  

  6. DRAS-OD: A Reinforcement Learning based Job Scheduler for On-Demand Job Scheduling in High-Performance Computing Systems Peer-reviewed

    Hang Cui, Keichi Takahashi, Hiroyuki Takizawa

    2024 Twelfth International Symposium on Computing and Networking (CANDAR) 21-29 2024/11/26

    Publisher: IEEE

    DOI: 10.1109/candar64496.2024.00011  

  7. Real-Time Phase Retrieval Using On-the-Fly Training of Sample-Specific Surrogate Models Peer-reviewed

    Ryota Koda, Keichi Takahashi, Hiroyuki Takizawa, Nozomu Ishiguro, Yukio Takahashi

    2024 Twelfth International Symposium on Computing and Networking (CANDAR) 59-66 2024/11/26

    Publisher: IEEE

    DOI: 10.1109/candar64496.2024.00015  

  8. A QA-Assisted Job Scheduler for Minimizing the Impact of Urgent Computing on HPC System Operation Peer-reviewed

    Tatsuyoshi Ohmura, Keichi Takahashi, Ryusuke Egawa, Hiroyuki Takizawa

    2024 Twelfth International Symposium on Computing and Networking Workshops (CANDARW) 197-203 2024/11/26

    Publisher: IEEE

    DOI: 10.1109/candarw64572.2024.00039  

  9. Modernizing an Operational Real-Time Tsunami Simulator to Support Diverse Hardware Platforms Peer-reviewed

    Keichi Takahashi, Takashi Abe, Akihiro Musa, Yoshihiko Sato, Yoichi Shimomura, Hiroyuki Takizawa, Shunichi Koshimura

    2024 IEEE International Conference on Cluster Computing (CLUSTER) 414-425 2024/09/24

    Publisher: IEEE

    DOI: 10.1109/cluster59578.2024.00043  

  10. XAI-Based Feature Importance Analysis on Loop Optimization Peer-reviewed

    Toshinobu Katayama, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

    2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 38 782-791 2024/05/27

    Publisher: IEEE

    DOI: 10.1109/ipdpsw63119.2024.00142  

  11. Combining Lossy Compression with Multi-Level Caching for Data Staging over Network Peer-reviewed

    Rei Aoyagi, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

    2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 41 212-221 2024/05/27

    Publisher: IEEE

    DOI: 10.1109/ipdpsw63119.2024.00059  

  12. Towards sub-10 nm spatial resolution by tender X-ray ptychographic coherent diffraction imaging Peer-reviewed

    Nozomu Ishiguro, Fusae Kaneko, Masaki Abe, Yuki Takayama, Junya Yoshida, Taiki Hoshino, Shuntaro Takazawa, Hideshi Uematsu, Yuhei Sasaki, Naru Okawa, Keichi Takahashi, Hiroyuki Takizawa, Hiroyuki Kishimoto, Yukio Takahashi

    Applied Physics Express 17 (5) 2024/05/01

    DOI: 10.35848/1882-0786/ad4846  

    ISSN: 1882-0778

    eISSN: 1882-0786

  13. AOBA: The Most Powerful Vector Supercomputer in the World Invited

    Hiroyuki Takizawa, Keichi Takahashi, Yoichi Shimomura, Ryusuke Egawa, Kenji Oizumi, Satoshi Ono, Takeshi Yamashita, Atsuko Saito

    Sustained Simulation Performance 2022 71-81 2024/03/15

    Publisher: Springer Nature Switzerland

    DOI: 10.1007/978-3-031-41073-4_6  

  14. Reuse distance-based shared LLC management mechanism for heterogeneous CPU-GPU systems Peer-reviewed

    Jiaheng Liu, Ryusuke Egawa, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

    IEICE Electronics Express 21 (4) 20230520-20230520 2024/02/25

    Publisher: Institute of Electronics, Information and Communications Engineers (IEICE)

    DOI: 10.1587/elex.21.20230520  

    eISSN: 1349-2543

  15. Current Status and Future of the ABINIT-MP Program

    Yuji MOCHIZUKI, Tatsuya NAKANO, Kota SAKAKURA, Hideo DOI, Koji OKUWAKI, Toshihiro KATO, Hiroyuki TAKIZAWA, Satoshi OHSHIMA, Tetsuya HOSHINO, Takahiro KATAGIRI

    Journal of Computer Chemistry, Japan 2024

    Publisher: Society of Computer Chemistry Japan

    DOI: 10.2477/jccj.2024-0022  

    ISSN: 1347-1767

    eISSN: 1347-3824

  16. Development Status of ABINIT-MP in 2023 Peer-reviewed

    Yuji MOCHIZUKI, Tatsuya NAKANO, Kota SAKAKURA, Koji OKUWAKI, Hideo DOI, Toshihiro KATO, Hiroyuki TAKIZAWA, Akira NARUSE, Satoshi OHSHIMA, Tetsuya HOSHINO, Takahiro KATAGIRI

    Journal of Computer Chemistry, Japan 23 (1) 4-8 2024

    Publisher: Society of Computer Chemistry Japan

    DOI: 10.2477/jccj.2024-0001  

    ISSN: 1347-1767

    eISSN: 1347-3824

  17. Association of nuclear cataract prevalence with UV radiation and heat load in lens of older people -five city study- Peer-reviewed

    Kotaro Kinoshita, Sachiko Kodera, Natsuko Hatsusaka, Ryusuke Egawa, Hiroyuki Takizawa, Eri Kubo, Hiroshi Sasaki, Akimasa Hirata

    Environmental Science and Pollution Research 30 (59) 123832-123842 2023/11/22

    Publisher: Springer Science and Business Media LLC

    DOI: 10.1007/s11356-023-31079-2  

    eISSN: 1614-7499

  18. Prototype of a Batched Quantum Circuit Simulator for the Vector Engine Peer-reviewed

    Keichi Takahashi, Toshio Mori, Hiroyuki Takizawa

    Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis 1499-1505 2023/11/12

    Publisher: ACM

    DOI: 10.1145/3624062.3624226  

  19. Conflict-aware workload co-execution on SX-aurora TSUBASA Peer-reviewed

    Riku Nunokawa, Yoichi Shimomura, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

    CCF Transactions on High Performance Computing 4 (6) 425-438 2023/10/05

    Publisher: Springer Science and Business Media LLC

    DOI: 10.1007/s42514-023-00171-x  

    ISSN: 2524-4922

    eISSN: 2524-4930

    Abstract: NEC SX-Aurora TSUBASA (SX-AT) is the latest vector supercomputer, consisting of host processors called Vector Hosts (VHs) and vector processors called Vector Engines (VEs). The goal of this work is to simultaneously use both VHs and VEs to increase the resource utilization and improve the system throughput by co-executing more workloads. One difficulty is that performance interferences among VH and VE workloads could occur because they share some computing resources and potentially compete to use the same resource at the same time, so-called resource conflicts. To achieve efficient workload co-execution, first, this paper experimentally investigates the performance interference between a VH and a VE, when each of the two processors executes a different workload. It is empirically shown that the frequency of system calls from the VE workload could be a good indicator to predict if the co-execution could cause severe performance interference, even though monitoring system calls requires a huge runtime overhead and it is impractical to simply use it for decision making of co-execution. Then, this paper proposes a workload co-execution strategy based on a practical approach to identifying a pair of VE and VH workloads that could cause severe performance interferences. Our evaluation results clearly demonstrate that the system call frequency can be used to predict if the workload can affect the performance of another co-executing workload, and VH’s CPU load can be a good approximation of the system call frequency. The proposed approach based on the CPU loads could accurately identify a pair of workloads causing frequent resource conflicts, and thus reduce the risk of severe performance interferences between co-executing workloads on an SX-AT system, resulting in shorter makespan without significantly increasing the turn-around time.

  20. Balancing exploitation and exploration in parallel Bayesian optimization under computing resource constraint Peer-reviewed

    Moto Satake, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

    2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 706-713 2023/05

    Publisher: IEEE

    DOI: 10.1109/ipdpsw59300.2023.00122  

  21. An Advantage Actor-Critic Deep Reinforcement Learning Method for Power Management in HPC Systems Peer-reviewed

    Fitra Rahmani Khasyah, Kadek Gemilang Santiyuda, Gabriel Kaunang, Faizal Makhrus, Muhammad Alfian Amrizal, Hiroyuki Takizawa

    Lecture Notes in Computer Science 94-107 2023/04/08

    Publisher: Springer Nature Switzerland

    DOI: 10.1007/978-3-031-29927-8_8  

    ISSN: 0302-9743

    eISSN: 1611-3349

  22. Equivalence Checking of Code Transformation by Numerical and Symbolic Approaches Peer-reviewed

    Shunpei Sugawara, Keichi Takahashi, Yoichi Shimomura, Ryusuke Egawa, Hiroyuki Takizawa

    Parallel and Distributed Computing, Applications and Technologies 373-386 2023/04/08

    Publisher: Springer Nature Switzerland

    DOI: 10.1007/978-3-031-29927-8_29  

    ISSN: 0302-9743

    eISSN: 1611-3349

  23. Towards Priority-Flexible Task Mapping for Heterogeneous Multi-core NUMA Systems Peer-reviewed

    Yifan Jin, Mulya Agung, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

    Parallel and Distributed Computing, Applications and Technologies 3-15 2023/04/08

    Publisher: Springer Nature Switzerland

    DOI: 10.1007/978-3-031-29927-8_1  

    ISSN: 0302-9743

    eISSN: 1611-3349

  24. A Task-Parallel Runtime for Heterogeneous Multi-node Vector Systems Peer-reviewed

    Kazuki Ide, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

    Parallel and Distributed Computing, Applications and Technologies 331-343 2023/04/08

    Publisher: Springer Nature Switzerland

    DOI: 10.1007/978-3-031-29927-8_26  

    ISSN: 0302-9743

    eISSN: 1611-3349

  25. Xevolver for Performance Tuning of C Programs Invited

    Hiroyuki Takizawa, Shunpei Sugawara, Yoichi Shimomura, Keichi Takahashi, Ryusuke Egawa

    Sustained Simulation Performance 2021 85-93 2023/02/18

    Publisher: Springer International Publishing

    DOI: 10.1007/978-3-031-18046-0_6  

  26. Estimation of the number of heat illness patients in eight metropolitan prefectures of Japan: Correlation with ambient temperature and computed thermophysiological responses Peer-reviewed

    Akito Takada, Sachiko Kodera, Koji Suzuki, Mio Nemoto, Ryusuke Egawa, Hiroyuki Takizawa, Akimasa Hirata

    Frontiers in Public Health 11 2023/02/17

    Publisher: Frontiers Media SA

    DOI: 10.3389/fpubh.2023.1061135  

    eISSN: 2296-2565

    The number of patients with heat illness transported by ambulance has been gradually increasing due to global warming. In intense heat waves, it is crucial to accurately estimate the number of cases with heat illness for management of medical resources. Ambient temperature is an essential factor with respect to the number of patients with heat illness, although thermophysiological response is a more relevant factor with respect to causing symptoms. In this study, we computed daily maximum core temperature increase and daily total amount of sweating in a test subject using a large-scale, integrated computational method considering the time course of actual ambient conditions as input. The correlation between the number of transported people and their thermophysiological temperature is evaluated in addition to conventional ambient temperature. With the exception of one prefecture, which features a different Köppen climate classification, the number of transported people in the remaining prefectures, with a Köppen climate classification of Cfa, are well estimated using either ambient temperature or computed core temperature increase and daily amount of sweating. For estimation using ambient temperature, an additional two parameters were needed to obtain comparable accuracy. Even using ambient temperature, the number of transported people can be estimated if the parameters are carefully chosen. This finding is practically useful for the management of ambulance allocation on hot days as well as public enlightenment.

  27. Toward Building a Digital Twin of Job Scheduling and Power Management on an HPC System Peer-reviewed

    Tatsuyoshi Ohmura, Yoichi Shimomura, Ryusuke Egawa, Hiroyuki Takizawa

    Lecture Notes in Computer Science 47-67 2023/01/12

    Publisher: Springer Nature Switzerland

    DOI: 10.1007/978-3-031-22698-4_3  

    ISSN: 0302-9743

    eISSN: 1611-3349

  28. Efficient Pause Location Prediction Using Quantum Annealing Simulations and Machine Learning Peer-reviewed

    Michael R. Zielewski, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

    IEEE Access 11 104285-104294 2023

    DOI: 10.1109/ACCESS.2023.3317698  

  29. Performance Evaluation of a Next-Generation SX-Aurora TSUBASA Vector Supercomputer Peer-reviewed

    Keichi Takahashi, Soya Fujimoto, Satoru Nagase, Yoko Isobe, Yoichi Shimomura, Ryusuke Egawa, Hiroyuki Takizawa

    ISC High Performance 359-378 2023

    DOI: 10.1007/978-3-031-32041-5_19  

  30. A Real-time Flood Inundation Prediction on SX-Aurora TSUBASA Peer-reviewed

    Yoichi Shimomura, Akihiro Musa, Yoshihiko Sato, Atsuhiko Konja, Guoqing Cui, Rei Aoyagi, Keichi Takahashi, Hiroyuki Takizawa

    2022 IEEE 29th International Conference on High Performance Computing, Data, and Analytics (HiPC) 27 192-197 2022/12

    Publisher: IEEE

    DOI: 10.1109/hipc56025.2022.00035  

  31. mdx: A Cloud Platform for Supporting Data Science and Cross-Disciplinary Research Collaborations Peer-reviewed

    Toyotaro Suzumura, Akiyoshi Sugiki, Hiroyuki Takizawa, Akira Imakura, Hiroshi Nakamura, Kenjiro Taura, Tomohiro Kudoh, Toshihiro Hanawa, Yuji Sekiya, Hiroki Kobayashi, Yohei Kuga, Ryo Nakamura, Renhe Jiang, Junya Kawase, Masatoshi Hanai, Hiroshi Miyazaki, Tsutomu Ishizaki, Daisuke Shimotoku, Daisuke Miyamoto, Kento Aida, Atsuko Takefusa, Takashi Kurimoto, Koji Sasayama, Naoya Kitagawa, Ikki Fujiwara, Yusuke Tanimura, Takayuki Aoki, Toshio Endo, Satoshi Ohshima, Keiichiro Fukazawa, Susumu Date, Toshihiro Uchibayashi

    2022 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech) 2022/09/12

    Publisher: IEEE

    DOI: 10.1109/dasc/picom/cbdcom/cy55231.2022.9927975  

  32. A SYCL-based high-level programming framework for HPC programmers to use remote FPGA clusters Peer-reviewed

    Satoshi Kaneko, Hiroyuki Takizawa, Kentaro Sano

    International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies 92-94 2022/06/09

    Publisher: ACM

    DOI: 10.1145/3535044.3535058  

  33. A Conflict-Aware Capacity Control Mechanism for Deep Cache Hierarchy Peer-reviewed

    Jiaheng LIU, Ryusuke EGAWA, Hiroyuki TAKIZAWA

    IEICE Transactions on Information and Systems E105.D (6) 1150-1163 2022/06/01

    Publisher: Institute of Electronics, Information and Communications Engineers (IEICE)

    DOI: 10.1587/transinf.2021edp7201  

    ISSN: 0916-8532

    eISSN: 1745-1361

  34. Towards Conflict-Aware Workload Co-execution on SX-Aurora TSUBASA Peer-reviewed

    Riku Nunokawa, Yoichi Shimomura, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

    Lecture Notes in Computer Science 163-174 2022/03/16

    Publisher: Springer International Publishing

    DOI: 10.1007/978-3-030-96772-7_16  

    ISSN: 0302-9743

    eISSN: 1611-3349

  35. Evaluating the Performance and Conformance of a SYCL Implementation for SX-Aurora TSUBASA Peer-reviewed

    Jiahao Li, Mulya Agung, Hiroyuki Takizawa

    Lecture Notes in Computer Science 36-47 2022/03/16

    Publisher: Springer International Publishing

    DOI: 10.1007/978-3-030-96772-7_4  

    ISSN: 0302-9743

    eISSN: 1611-3349

  36. A Method for Reducing Time-to-Solution in Quantum Annealing Through Pausing Peer-reviewed

    Michael Ryan Zielewski, Hiroyuki Takizawa

    International Conference on High Performance Computing in Asia-Pacific Region 7 137-145 2022/01/07

    Publisher: ACM

    DOI: 10.1145/3492805.3492815  

  37. A Cost Model for Compilers Based on Transfer Learning Peer-reviewed

    Yuta Sasaki, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

    IPDPS Workshops 942-951 2022

    DOI: 10.1109/IPDPSW55747.2022.00152  

  38. Automated selection of build configuration based on machine learning Peer-reviewed

    Reo Furuhata, Minglu Zhao, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa

    IPDPS Workshops 934-941 2022

    DOI: 10.1109/IPDPSW55747.2022.00151  

  39. Spatiotemporal Anomaly Detection for Large-Scale Sensor Data Peer-reviewed

    Minglu Zhao, Hiroyuki Takizawa, Tomoya Soma

    2021 12th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP) 2021/12/10

    Publisher: IEEE

    DOI: 10.1109/paap54281.2021.9720310  

  40. Portability of Vectorization-aware Performance Tuning Expertise across System Generations Peer-reviewed

    Shunpei Sugawara, Yoichi Shimomura, Ryusuke Egawa, Hiroyuki Takizawa

    2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC) 30 242-248 2021/12

    Publisher: IEEE

    DOI: 10.1109/mcsoc51149.2021.00043  

  41. A memory bank conflict prevention mechanism for SYCL on SX-Aurora TSUBASA Peer-reviewed

    Wenbin Wang, Jiahao Li, Yohichi Shimomura, Hiroyuki Takizawa

    2021 Ninth International Symposium on Computing and Networking Workshops (CANDARW) 2 217-222 2021/11

    Publisher: IEEE

    DOI: 10.1109/candarw53999.2021.00043  

  42. Evaluating I/O Acceleration Mechanisms of SX-Aurora TSUBASA Peer-reviewed

    Yuta Sasaki, Ayumu Ishizuka, Mulya Agung, Hiroyuki Takizawa

    2021 IEEE International Parallel & Distributed Processing Symposium Workshops 2021/05

  43. Evaluation of flood damage reduction throughout Japan from adaptation measures taken under a range of emissions mitigation scenarios Peer-reviewed

    Tao Yamamoto, So Kazama, Yoshiya Touge, Hayata Yanagihara, Tsuyoshi Tada, Takeshi Yamashita, Hiroyuki Takizawa

    Climatic Change 165 (60) 2021/04

    Publisher: Springer Science and Business Media LLC

    DOI: 10.1007/s10584-021-03081-5  

    ISSN: 0165-0009

    eISSN: 1573-1480

  44. OpenCL-like offloading with metaprogramming for SX-Aurora TSUBASA Peer-reviewed

    Hiroyuki Takizawa, Shinji Shiotsuki, Naoki Ebata, Ryusuke Egawa

    Parallel Computing 102754-102754 2021/02

    Publisher: Elsevier BV

    DOI: 10.1016/j.parco.2021.102754  

    ISSN: 0167-8191

  45. Preemptive Parallel Job Scheduling for Heterogeneous Systems Supporting Urgent Computing Peer-reviewed

    Mulya Agung, Yuta Watanabe, Henning Weber, Ryusuke Egawa, Hiroyuki Takizawa

    IEEE Access 9 17557-17571 2021

    Publisher: Institute of Electrical and Electronics Engineers (IEEE)

    DOI: 10.1109/access.2021.3053162  

    eISSN: 2169-3536

  46. neoSYCL: a SYCL implementation for SX-Aurora TSUBASA Peer-reviewed

    Yinan Ke, Mulya Agung, Hiroyuki Takizawa

    International Conference on High Performance Computing in Asia-Pacific Region 2021/01

  47. Improving Quantum Annealing Performance on Embedded Problems Invited Peer-reviewed

    Michael Zielewski, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

    Supercomputing Frontiers and Innovations 2020/12

  48. Failure Prediction in Datacenters Using Unsupervised Multimodal Anomaly Detection Peer-reviewed

    Minglu Zhao, Reo Furuhata, Mulya Agung, Hiroyuki Takizawa, Tomoya Soma

    The IEEE BigData 2020, the third international conference on the Internet of Things Data Analytics (IoTDA) 2020/12

  49. A Conflict-Aware Capacity Control Mechanism for Last-Level Cache Peer-reviewed

    Jiaheng Liu, Ryusuke Egawa, Mulya Agung, Hiroyuki Takizawa

    Proceedings - 2020 8th International Symposium on Computing and Networking Workshops, CANDARW 2020 416-420 2020/11/01

    Publisher: Institute of Electrical and Electronics Engineers Inc.

    DOI: 10.1109/CANDARW51189.2020.00085  

  50. Exploiting the Potentials of the Second Generation SX-Aurora TSUBASA Peer-reviewed

    Ryusuke Egawa, Souya Fujimoto, Tsuyoshi Yamashita, Daisuke Sasaki, Yoko Isobe, Yoichi Shimomura, Hiroyuki Takizawa

    The 11th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS’20) 2020/11

  51. Improving the accuracy in SpMV implementation selection with machine learning Peer-reviewed

    Reo Furuhata, Minglu Zhao, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

    The Eighth International Conference on Computing and Networking Workshops (CANDARW) 2020/11

  52. Polymorphic Data Layout for SX-Aurora TSUBASA Vector Engines Peer-reviewed

    Naoki Ebata, Yoko Isobe, Ryusuke Egawa, Hiroyuki Takizawa

    The Eighth International Conference on Computing and Networking (CANDAR) 2020/11

  53. Automatic Load-Balancing Tuning of a Flood Simulation Code Using Bayesian Optimization Peer-reviewed

    Ayumu Ishiduka, Tsuyoshi Yamashita, Ryusuke Egawa, Hiroyuki Takizawa, Tao Yamamoto, So Kazama

    The 4th cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming (xSIG2020) 2020/07

  54. Quantum Compiler : Automatic Vectorization Assisted by Quantum Annealer Peer-reviewed

    Yuta Sasaki, Michael Ryan Zielewski, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

    The ISC High Performance 2020 (poster) 2020/06

  55. Challenges in Solving Scheduling Problems with the D-Wave Quantum Annealer Peer-reviewed

    Michael Ryan Zielewski, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

    The ISC High Performance 2020 (poster) 2020/06

  56. Automatically Avoiding Memory Access Conflicts on SX-Aurora TSUBASA Peer-reviewed

    Naoki Ebata, Ryusuke Egawa, Yoko Isobe, Ryoji Takaki, Hiroyuki Takizawa

    2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2020/05

    Publisher: IEEE

    DOI: 10.1109/ipdpsw50202.2020.00139  

  57. Task Priority Control for the HPX Runtime System Peer-reviewed

    Suhang Jiang, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

    2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2020/05

    Publisher: IEEE

    DOI: 10.1109/ipdpsw50202.2020.00137  

  58. Comparison of Direct and Indirect Networks for High-Performance FPGA Clusters Peer-reviewed

    Antoniette Mondigo, Tomohiro Ueno, Kentaro Sano, Hiroyuki Takizawa

    Applied Reconfigurable Computing. Architectures, Tools, and Applications 314-329 2020/04

    Publisher: Springer International Publishing

    DOI: 10.1007/978-3-030-44534-8_24  

    ISSN: 0302-9743

    eISSN: 1611-3349

  59. Xevolver: A code transformation framework for separation of system-awareness from application codes Peer-reviewed

    Kazuhiko Komatsu, Ayumu Gomi, Ryusuke Egawa, Daisuke Takahashi, Reiji Suda, Hiroyuki Takizawa

    Concurrency and Computation: Practice and Experience 32 (7) 2020/04

    DOI: 10.1002/cpe.5577  

    ISSN: 1532-0626

    eISSN: 1532-0634

  60. Online MPI Process Mapping for Coordinating Locality and Memory Congestion on NUMA Systems Peer-reviewed

    Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

    Supercomputing Frontiers and Innovations 7 (1) 71-90 2020/03

    Publisher: FSAEIHE South Ural State University (National Research University)

    DOI: 10.14529/jsfi200104  

    ISSN: 2313-8734

  61. Exafsa: Parallel fluid-structure-acoustic simulation

    Florian Lindner, Amin Totounferoush, Miriam Mehl, Benjamin Uekermann, Neda Ebrahimi Pour, Verena Krupp, Sabine Roller, Thorsten Reimann, Dörte C. Sternel, Ryusuke Egawa, Hiroyuki Takizawa, Frédéric Simonis

    Lecture Notes in Computational Science and Engineering 136 271-300 2020

    Publisher: Springer

    DOI: 10.1007/978-3-030-47956-5_10  

    ISSN: 1439-7358

    eISSN: 2197-7100

  62. Preliminary Evaluation towards Task Priority Control in HPX Peer-reviewed

    Suhang Jiang, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

    International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2020) (poster) 2020/01

  63. Acceleration of Hyper-Parameter Auto-Tuning with Parallelization and Time Constraints Peer-reviewed

    Chaoyi Zhang, Ryusuke Egawa, Hiroyuki Takizawa

    International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2020) (poster) 2020/01

  64. An Optimization Technology of Software Auto-Tuning Applied to Machine Learning Software Peer-reviewed

    Toshiki Tabeta, Naoto Seki, Akihiro Fujii, Teruo Tanaka, Hiroyuki Takizawa

    International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2020) (poster) 2020/01

  65. DeLoc: A Locality and Memory Congestion-aware Task Mapping Method for Modern NUMA Systems Peer-reviewed

    Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

    IEEE Access 2020

  66. An OpenCL-like Offload Programming Framework for SX-Aurora TSUBASA Peer-reviewed

    Hiroyuki Takizawa, Shinji Shiotsuki, Naoki Ebata, Ryusuke Egawa

    The 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT 2019) 285-291 2019/12

  67. Peachy Parallel Assignments (EduHPC 2019)

    Mulya Agung, Allen Malony, Hiroyuki Takizawa, David P. Bunde, Muhammad A. Amrizal, Steven Bogaerts, Ryusuke Egawa, Daniel A. Ellsworth, Jorge Fernandez-Fabeiro, Arturo Gonzalez-Escribano, Sukhamay Kundu, Alina Lazar

    Proceedings of EduHPC 2019: Workshop on Education for High Performance Computing - Held in conjunction with SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis 75-83 2019/11/01

    Publisher: Institute of Electrical and Electronics Engineers Inc.

    DOI: 10.1109/EduHPC49559.2019.00015  

  68. An Automatic MPI Process Mapping Method Considering Locality and Memory Congestion on NUMA Systems Peer-reviewed

    Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

    2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC) 17-24 2019/09

  69. Optimization of a gas-particle flow solver on vector supercomputers Peer-reviewed

    Yoichi Shimomura, Midori Kano, Takashi Soga, Kenta Yamaguchi, Akihiro Musa, Yusuke Mizuno, Shun Takahashi, Ryusuke Egawa, Hiroyuki Takizawa

    The 31st International Conference on Parallel Computational Fluid Dynamics (ParCFD’2019) 1-4 2019/06

  70. Memory First: A Performance Tuning Strategy Focusing on Memory Access Patterns Peer-reviewed

    Naoki Ebata, Ryusuke Egawa, Yoko Isobe, Ryoji Takaki, Hiroyuki Takizawa

    The ISC High Performance conference 2019 (poster) 2019/06

  71. Scaling performance for n-body stream computation with a ring of FPGAs Peer-reviewed

    Jens Huthmann, Shin Abiko, Artur Podobas, Kentaro Sano, Hiroyuki Takizawa

    The International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies (HEART2019) 1-6 2019/06

  72. Scalability Analysis of Deeply Pipelined Tsunami Simulation with Multiple FPGAs Peer-reviewed

    Antoniette Mondigo, Tomohiro Ueno, Kentaro Sano, Hiroyuki Takizawa

    IEICE Transactions on Information and Systems E102-D (5) 1029-1036 2019/05

    Publisher: Institute of Electronics, Information and Communications Engineers (IEICE)

    DOI: 10.1587/transinf.2018RCP0007  

    ISSN: 0916-8532

    eISSN: 1745-1361

  73. An Energy Optimization Method for Hybrid In-Memory Checkpointing Peer-reviewed

    Muhammad Alfian Amrizal, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

    2019 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS) (poster) 2019/04

  74. The Impacts of Locality and Memory Congestion-aware Thread Mapping on Energy Consumption of Modern NUMA Systems Peer-reviewed

    Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

    2019 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS) 2019/04

  75. Performance Evaluation of Different Implementation Schemes of an Iterative Flow Solver on Modern Vector Machines Peer-reviewed

    Kenta Yamaguchi, Takashi Soga, Yoichi Shimomura, Thorsten Reimann, Kazuhiko Komatsu, Ryusuke Egawa, Akihiro Musa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Supercomputing Frontiers and Innovations 6 (1) 36-47 2019/03

    DOI: 10.14529/jsfi190106  

  76. Xevolver: A user-defined code transformation approach to streamlining legacy code migration

    Hiroyuki Takizawa, Reiji Suda, Daisuke Takahashi, Ryusuke Egawa

    Advanced Software Technologies for Post-Peta Scale Computing: The Japanese Post-Peta CREST Research Project 163-181 2018/12/06

    Publisher: Springer Singapore

    DOI: 10.1007/978-981-13-1924-2_9  

  77. Enhancing memory bandwidth in a single stream computation with multiple FPGAs Peer-reviewed

    Antoniette Mondigo, Kentaro Sano, Hiroyuki Takizawa

    The 2018 International Conference on Field-Programmable Technology (FPT’18) 2018/12

  78. Automatic hyperparameter tuning of machine learning models under time constraints Peer-reviewed

    Zhen Wang, Mulya Agung, Ryusuke Egawa, Reiji Suda, Hiroyuki Takizawa

    IEEE Big Data 2018 Workshop 2018/12

  79. A Locality and Memory Congestion-aware Thread Mapping Method for Modern NUMA Systems Peer-reviewed

    Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

    ACM/IEEE Supercomputing Conference 2018 (SC18) (poster) 2018/11

  80. Preconditioner auto-tuning with deep learning for sparse iterative algorithms Peer-reviewed

    Kenya Yamada, Takahiro Katagiri, Hiroyuki Takizawa, Kazuo Minami, Mitsuo Yokokawa, Toru Nagai, Masao Ogino

    The Sixth International Symposium on Computing and Networking Workshops (CANDARW 2018), LHAM workshop 2018/11

  81. Investigating the Effects of Dynamic Thread Team Size Adjustment for Irregular Applications Peer-reviewed

    Xiong Xiao, Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

    The Sixth International Symposium on Computing and Networking (CANDAR 2018) 2018/11

  82. A Failure Prediction-based Adaptive Checkpointing Method with Less Reliance on Temperature Monitoring for HPC Applications Peer-reviewed

    Muhammad Alfian Amrizal, Pei Li, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa

    2018 IEEE International Conference on Cluster Computing, FTS workshop 483-491 2018/09

  83. A Machine Learning-based Approach for Selecting SpMV Kernels and Matrix Storage Formats Peer-reviewed

    Hang Cui, Shoichi Hirasawa, Hiroaki Kobayashi, Hiroyuki Takizawa

    IEICE Transactions on Information and Systems E101-D (9) 2307-2314 2018/09

  84. Expressing the Differences in Code Optimizations between Intel Knights Landing and NEC SX-ACE Processors

    Hiroyuki Takizawa, Thorsten Reimann, Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Akihiro Musa, Hiroaki Kobayashi

    The 13th World Congress on Computational Mechanics/2nd Pan American Congress on Computational Mechanics 2018/07

  85. Performance Estimation of Deeply Pipelined Fluid Simulation on Multiple FPGAs with High-speed Communication Subsystem Peer-reviewed

    Antoniette Mondigo, Kentaro Sano, Hiroyuki Takizawa

    The 29th Annual IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2018) 10-12 2018/07

  86. Migrating an Old Vector Code to Modern Vector Machines Peer-reviewed

    Hiroyuki Takizawa, Kenta Yamaguchi, Takashi Soga, Thorsten Reimann, Kazuhiko Komatsu, Ryusuke Egawa, Akihiro Musa, Hiroaki Kobayashi

    30th International Conference on Parallel Computational Fluid Dynamics 2018/04

  87. Optimization of a Polydisperse Multiphase Flow Simulation Code with Reactions and Phase Changes

    佐々木 大輔, 加藤 季広, 磯部 洋子, 笠原 弘貴, 渡部 広吾輝, 志村 啓, 奥野 航平, 松尾 亜紀子, 江川 隆輔, 滝沢 寛之, 小林 広明

    SENAC : 東北大学大型計算機センター広報 51 (1) 47-51 2018/01

    Publisher: Cyberscience Center, Tohoku University

    ISSN: 0286-7419

    Bulletin

  88. Use of Code Structural Features for Machine Learning to Predict Effective Optimizations Peer-reviewed

    Yuki Kawarabatake, Mulya Agung, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa

    32nd IEEE International Parallel & Distributed Processing Symposium Workshops (IPDPSW), International Workshop on Automatic Performance Tuning 1049-1055 2018

    Publisher: IEEE Computer Society

    DOI: 10.1109/IPDPSW.2018.00163  

  89. Energy-Performance Modeling of Speculative Checkpointing for Exascale Systems Peer-reviewed

    Muhammad Alfian Amrizal, Atsuya Uno, Yukinori Sato, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEICE Transactions on Information and Systems E100-D (12) 2749-2760 2017/12

    DOI: 10.1587/transinf.2017PAP0002  

    ISSN: 1745-1361

  90. Optimizing Energy Consumption on HPC Systems with a Multi-Level Checkpointing Mechanism Peer-reviewed

    Muhammad Alfian Amrizal, Hiroyuki Takizawa

    2017 IEEE International Conference on Networking, Architecture, and Storage, NAS 2017 - Proceedings 2017/09/06

    Publisher: Institute of Electrical and Electronics Engineers Inc.

    DOI: 10.1109/NAS.2017.8026868  

  91. Potential of a modern vector supercomputer for practical applications: performance evaluation of SX-ACE Peer-reviewed

    Ryusuke Egawa, Kazuhiko Komatsu, Shintaro Momose, Yoko Isobe, Akihiro Musa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Journal of Supercomputing 73 (9) 3948-3976 2017/09

    DOI: 10.1007/s11227-017-1993-y  

    ISSN: 0920-8542

    eISSN: 1573-0484

  92. A customizable auto-tuning scenario with user-defined code transformations Peer-reviewed

    Hiroyuki Takizawa, Daichi Sato, Shoichi Hirasawa, Daisuke Takahashi

    Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017 1372-1378 2017/06/30

    Publisher: Institute of Electrical and Electronics Engineers Inc.

    DOI: 10.1109/IPDPSW.2017.79  

  93. Potential of Code Optimization Using Machine Learning

    滝沢寛之, 崔航, 平澤将一

    計算工学講演会論文集 22 2017/06

  94. Automatic Generation of Code Transformation Rules for Data Layout Optimization

    Takeshi Yamada, Shoichi Hirasawa, Reiji Suda, Hiroyuki Takizawa

    IPSJ SIG Technical Reports (HPC) 2017-HPC-158 (28) 1-8 2017/03

  95. A Study on Auto-Tuning Using Scenario Templates

    Daichi Sato, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IPSJ National Convention 2017 (1) 45-46 2017/03

  96. Toward Dynamic Load Balancing across OpenMP Thread Teams for Irregular Workloads Peer-reviewed

    Xiong Xiao, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    International Journal of Networking and Computing 7 (2) 387-404 2017

    Publisher: IJNC Editorial Committee

    DOI: 10.15803/ijnc.7.2_387  

    ISSN: 2185-2839

    In the field of high performance computing, massively parallel many-core processors such as Intel Xeon Phi coprocessors are becoming popular because they can significantly accelerate various applications. To parallelize applications efficiently for such many-core processors, several high-level programming models have been proposed. OpenMP is the de facto standard programming model, mainly for shared-memory parallel processing. For hierarchical parallel processing, OpenMP version 4.0 or later allows programmers to create multiple thread teams, each containing a set of newly created threads that can synchronize with one another. When multiple thread teams are used to execute an application, dynamic load balancing across thread teams is important, since static load balancing easily leads to load imbalance across teams and thus degrades performance. In this paper, we first motivate our work by clarifying the benefit of using multiple thread teams to execute an irregular workload on a many-core processor. Then, we demonstrate that dynamic load balancing across those thread teams has the potential to significantly improve the performance of irregular workloads on a many-core processor, even when the scheduling overhead is taken into account. Although the current OpenMP specification does not provide such a dynamic load balancing mechanism, its benefits are discussed through experiments using the Intel Xeon Phi coprocessor. We evaluate the performance gain of dynamic load balancing across thread teams using a ray tracing code. The results show that such a mechanism can improve performance by up to 14% compared to static load balancing across teams, with the scheduling overhead taken into account.

  97. A Directive Generation Approach to High Code-Maintainability for Various HPC Systems Peer-reviewed

    Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    International Journal of Networking and Computing 7 (2) 405-418 2017

  98. Vectorization-aware Loop Optimization with User-defined Code Transformations Peer-reviewed

    Hiroyuki Takizawa, Thorsten Reimann, Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Akihiro Musa, Hiroaki Kobayashi

    2017 IEEE International Conference on Cluster Computing (CLUSTER) 685-692 2017

    DOI: 10.1109/CLUSTER.2017.102  

    ISSN: 1552-5244

  99. Performance and Power Analysis of SX-ACE using HP-X Benchmark Programs Peer-reviewed

    Ryusuke Egawa, Kazuhiko Komatsu, Hiroyuki Takizawa, Akihiro Musa, Hiroaki Kobayashi, Yoko Isobe, Toshihiro Kato, Souya Fujimoto

    2017 IEEE International Conference on Cluster Computing (CLUSTER) 693-700 2017

    DOI: 10.1109/CLUSTER.2017.65  

    ISSN: 1552-5244

  100. An Application-Level Incremental Checkpointing Mechanism with Automatic Parameter Tuning Peer-reviewed

    Hiroyuki Takizawa, Muhammad Alfian Amrizal, Kazuhiko Komatsu, Ryusuke Egawa

    2017 Fifth International Symposium on Computing and Networking (CANDAR) 389-394 2017

    DOI: 10.1109/CANDAR.2017.96  

    ISSN: 2379-1888

  101. Designing an Open Database of System-aware Code Optimizations Peer-reviewed

    Ryusuke Egawa, Kazuhiko Komatsu, Hiroyuki Takizawa

    2017 Fifth International Symposium on Computing and Networking (CANDAR) 369-374 2017

    DOI: 10.1109/CANDAR.2017.102  

    ISSN: 2379-1888

  102. A Memory Congestion-aware MPI Process Placement for Modern NUMA Systems Peer-reviewed

    Mulya Agung, Muhammad Alfian Amrizal, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa

    2017 IEEE 24th International Conference on High Performance Computing (HiPC) 152-161 2017

    DOI: 10.1109/HiPC.2017.00026  

    ISSN: 1094-7256

  103. Directive Translation for Various HPC Systems Using the Xevolver Framework Invited

    Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Sustained Simulation Performance 2016 109-117 2016/12

    DOI: 10.1007/978-3-319-46735-1_9  

  104. Making a Legacy Code Auto-tunable without Messing It Up Peer-reviewed

    Hiroyuki Takizawa, Daichi Sato, Shoichi Hirasawa, Hiroaki Kobayashi

    ACM/IEEE Supercomputing Conference 2016 (SC16) 2016/11

  105. A Power-Performance Tradeoff of HBM by Limiting Access Channels Peer-reviewed

    Takuya Toyoshima, Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings of IEEE Symposium on Low-Power and High-Speed Chips 2016/04

  106. A Bypass Mechanism for Application-Adaptive Cache Resizing Peer-reviewed

    Masayuki Sato, Takumi Takai, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    電子情報通信学会論文誌 J99-D (3) 2016/03

  107. A Study on Code Transformation Using Machine Learning

    川原畑 勇希, 平澤 将一, 滝沢 寛之, 小林 広明

    電気関係学会東北支部連合大会講演論文集 2016 227-227 2016

    Publisher: 電気関係学会東北支部連合大会実行委員会

    DOI: 10.11528/tsjc.2016.0_227  

  108. Automatic Parameter Tuning of Stencil Computation Using Directives Peer-reviewed

    Takuya Tsunogawa, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IPSJ Transactions on Advanced Computing Systems 2016

  109. A Cache Partitioning Mechanism to Protect Shared Data for CMPs Peer-reviewed

    Masayuki Sato, Shin Nishimura, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2016 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS XIX) 2016

    DOI: 10.1109/CoolChips.2016.7503674  

    ISSN: 2473-4683

  110. Translation of Large-Scale Simulation Codes for an OpenACC Platform Using the Xevolver Framework Peer-reviewed

    Kazuhiko Komatsu, Ryusuke Egawa, Shoichi Hirasawa, Hiroyuki Takizawa, Ken'ichi Itakura, Hiroaki Kobayashi

    International Journal of Networking and Computing 6 (2) 167-180 2016

  111. A Code Selection Mechanism Using Deep Learning Peer-reviewed

    Hang Cui, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2016 IEEE 10th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC) 385-392 2016

    DOI: 10.1109/MCSoC.2016.46  

  112. The Importance of Dynamic Load Balancing among OpenMP Thread Teams for Irregular Workloads Peer-reviewed

    Xiong Xiao, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2016 Fourth International Symposium on Computing and Networking (CANDAR) 529-535 2016

    DOI: 10.1109/CANDAR.2016.48  

    ISSN: 2379-1888

  113. A Directive Generation Approach Using User-defined Rules Peer-reviewed

    Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2016 Fourth International Symposium on Computing and Networking (CANDAR) 515-521 2016

    DOI: 10.1109/CANDAR.2016.94  

    ISSN: 2379-1888

  114. A User-Defined Code Transformation Approach to Overlapping MPI Communication with Computation Peer-reviewed

    Yasuharu Hayashi, Hiroyuki Takizawa, Hiroaki Kobayashi

    2016 Fourth International Symposium on Computing and Networking (CANDAR) 508-514 2016

    DOI: 10.1109/CANDAR.2016.35  

    ISSN: 2379-1888

  115. Xevdriver: A software system supporting XML-based source-to-source code transformations on Fortran programs Peer-reviewed

    Reiji Suda, Hiroyuki Takizawa

    2016 Fourth International Symposium on Computing and Networking (CANDAR) 522-528 2016

    DOI: 10.1109/CANDAR.2016.113  

    ISSN: 2379-1888

  116. Performance Evaluation of Compiler-Assisted OpenMP Codes on Various HPC Systems Invited

    Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Sustained Simulation Performance 2015 147-157 2015/12

    DOI: 10.1007/978-3-319-20340-9_12  

  117. A Light-Weight Rollback Mechanism for Testing Kernel Variants in Auto-Tuning Peer-reviewed

    Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEICE Transactions on Information and Systems E98-D (12) 2178-2186 2015/12

    DOI: 10.1587/transinf.2015PAP0028  

    ISSN: 1745-1361

  118. An approach to the highest efficiency of the HPCG benchmark on the SX-ACE supercomputer Peer-reviewed

    Kazuhiko Komatsu, Ryusuke Egawa, Yoko Isobe, Ryusei Ogata, Hiroyuki Takizawa, Hiroaki Kobayashi

    ACM/IEEE Supercomputing Conference 2015 (SC15) 1-2 2015/11

  119. Expressing system-awareness as code transformations for performance portability across diverse HPC systems Peer-reviewed

    Hiroyuki Takizawa, Shoichi Hirasawa, Kazuhiko Komatsu, Ryusuke Egawa, Hiroaki Kobayashi

    International Workshop on Portability Among HPC Architectures for Scientific Applications 2015 1-6 2015/11

  120. FLEXII: A Flexible Insertion Policy for Dynamic Cache Resizing Mechanisms Peer-reviewed

    Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEICE Transactions on Electronics E98-C (7) 550-558 2015/07

    DOI: 10.1587/transele.E98.C.550  

    ISSN: 1745-1353

  121. Achieving Both Performance and Maintainability of Real Applications with Xevolver

    平澤将一, 滝沢寛之, 小林広明

    計算工学講演会論文集 20 4p 2015/06

  122. Performance Evaluation of an OpenMP Parallelization by Using Automatic Parallelization Information

    Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Sustained Simulation Performance 2014 119-126 2015

    Publisher: Springer International Publishing

    DOI: 10.1007/978-3-319-10626-7_10  

  123. A Data Management Policy for Energy-Efficient Cache Mechanisms

    Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Sustained Simulation Performance 2015 61-75 2015

    DOI: 10.1007/978-3-319-20340-9_6  

  124. Automatic Parameter Tuning of Hierarchical Incremental Checkpointing Peer-reviewed

    Alfian Amrizal, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    High Performance Computing for Computational Science - VECPAR 2014 8969 298-309 2015

    DOI: 10.1007/978-3-319-17353-5_25  

    ISSN: 0302-9743

  125. Optimized Data Transfers Based on the OpenCL Event Management Mechanism Peer-reviewed

    Hiroyuki Takizawa, Shoichi Hirasawa, Makoto Sugawara, Isaac Gelado, Hiroaki Kobayashi, Wen-mei W. Hwu

    Scientific Programming 2015 (576498) 1-16 2015

    DOI: 10.1155/2015/576498  

    ISSN: 1058-9244

    eISSN: 1875-919X

  126. Combining code refactoring and auto-tuning to improve performance portability of high-performance computing applications Peer-reviewed

    Chunyan Wang, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    The Sixth International Conference on Computational Logics, Algebras, Programming, Tools, and Benchmarking (COMPUTATION TOOLS 2015) 20-26 2015

  127. Identification and elimination of platform-specific code smells in high performance computing applications Peer-reviewed

    Chunyan Wang, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    International Journal of Networking and Computing 5 (1) 180-199 2015

    Publisher: IJNC Editorial Committee

    DOI: 10.15803/ijnc.5.1_180  

    ISSN: 2185-2839

    A code smell is a code pattern that might indicate a code or design problem, which makes the application code hard to evolve and maintain. Automatic detection of code smells has been studied to help users find which parts of their application codes should be refactored. However, code smells have not been defined in a formal manner. Moreover, existing detection tools are designed mainly for object-oriented applications, but rarely provided for high performance computing (HPC) applications. HPC applications are usually optimized for a particular platform to achieve a high performance, and hence have special code smells called platform-specific code smells (PSCSs). The purpose of this work is to develop a code smell alert system to help users find PSCSs of HPC applications to improve the performance portability across different platforms. This paper presents a PSCS alert system that is based on an abstract syntax tree (AST) and XML. Code patterns of PSCSs are defined in a formal way using the AST information represented in XML. XML Path Language (XPath) is used to describe those patterns. A database is built to store the transformation recipes written in XSLT files for eliminating detected PSCSs. The recall and precision evaluation results obtained by using real applications show that the proposed system can detect potential PSCSs accurately. The evaluation on performance portability of real applications demonstrates that eliminating PSCSs leads to significant performance changes and therefore the code portions with detected PSCSs have to be refactored to improve the performance portability across multiple platforms.

  128. Auto-Tuning Using Xevolver

    平澤将一, 肖熊, 滝沢寛之, 小林広明

    計算工学会学会誌「計算工学」 20 (2) 14-17 2015

  129. An Energy-Efficient Dynamic Memory Address Mapping Mechanism Peer-reviewed

    Masayuki Sato, Chengguang Han, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2015 IEEE Symposium on Low-Power and High-Speed Chips 1-3 2015

    DOI: 10.1109/CoolChips.2015.7158660  

  130. A Verification Framework for Streamlining Empirical Auto-tuning Peer-reviewed

    Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings of 2015 Third International Symposium on Computing and Networking (CANDAR) 508-514 2015

    DOI: 10.1109/CANDAR.2015.115  

    ISSN: 2379-1888

  131. Migration of an Atmospheric Simulation Code to an OpenACC Platform Using the Xevolver Framework Peer-reviewed

    Kazuhiko Komatsu, Ryusuke Egawa, Shoichi Hirasawa, Hiroyuki Takizawa, Ken'ichi Itakura, Hiroaki Kobayashi

    Proceedings of 2015 Third International Symposium on Computing and Networking (CANDAR) 515-520 2015

    DOI: 10.1109/CANDAR.2015.102  

    ISSN: 2379-1888

  132. Xevtgen: Fortran code transformer generator for high performance scientific codes Peer-reviewed

    Reiji Suda, Hiroyuki Takizawa, Shoichi Hirasawa

    Proceedings of 2015 Third International Symposium on Computing and Networking (CANDAR) 528-534 2015

    DOI: 10.1109/CANDAR.2015.63  

    ISSN: 2379-1888

  133. A Case Study of User-Defined Code Transformations for Data Layout Optimizations Peer-reviewed

    Takeshi Yamada, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings of 2015 Third International Symposium on Computing and Networking (CANDAR) 535-541 2015

    DOI: 10.1109/CANDAR.2015.96  

    ISSN: 2379-1888

  134. Xevtgen: Fortran code transformer generator for high performance scientific codes Peer-reviewed

    Reiji Suda, Hiroyuki Takizawa, Shoichi Hirasawa

    Proceedings of 2015 Third International Symposium on Computing and Networking (CANDAR) 6 (2) 528-534 2015

    DOI: 10.1109/CANDAR.2015.63  

    ISSN: 2379-1888

  135. MVP-Cache: A Multi-Banked Cache Memory for Energy-Efficient Vector Processing of Multimedia Applications Peer-reviewed

    Ye Gao, Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEICE Transactions on Information and Systems E97-D (11) 2835-2843 2014/11

    DOI: 10.1587/transinf.2014EDP7227  

    ISSN: 1745-1361

  136. Early evaluation of the SX-ACE processor Peer-reviewed

    Ryusuke Egawa, Shintaro Momose, Kazuhiko Komatsu, Yoko Isobe, Hiroyuki Takizawa, Akihiro Musa, Hiroaki Kobayashi

    ACM/IEEE Supercomputing Conference 2014 (SC14) 1-2 2014/11

  137. A Study on Reducing the Power Consumption of Vector Media Processors

    宇野 渉, 高 也, 佐藤 雅之, 江川 隆輔, 滝沢 寛之, 小林 広明

    電気関係学会東北支部連合大会予稿集 2014/08

  138. A Study on Managing Inter-Thread Shared Data in Cache Memory

    西村 秦, 佐藤 雅之, 江川 隆輔, 滝沢 寛之, 小林 広明

    電気関係学会東北支部連合大会予稿集 2014/08

  139. Exploring system architectures for next-generation CFD simulations in the postpeta-scale era Peer-reviewed

    Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Takashi Soga, Akihiro Musa, Hiroaki Kobayashi

    Journal of Fluid Science and Technology 9 (5) JFST0073-JFST0073 2014

    Publisher: The Japan Society of Mechanical Engineers

    DOI: 10.1299/jfst.2014jfst0073  

    ISSN: 1880-5558

    CFD simulations with uniform grids have attracted attention as next-generation CFD simulations on large-scale supercomputing systems. The Building-Cube Method (BCM) is one such next-generation CFD method. Its basic idea is to balance the computational load among processing elements on a supercomputing system by dividing the whole calculation into many parallel tasks with the same amount of computation. Thus, it is suitable for highly parallel computation on supercomputing systems. This paper first implements BCM on five supercomputing systems as an example of a next-generation CFD simulation in the upcoming postpeta-scale era. Then, through theoretical analyses and performance evaluations, it clarifies the requirements that a next-generation CFD simulation places on future supercomputing systems. The performance evaluations show that, as the number of processing elements increases, the imbalance of data exchanges among nodes becomes more serious than that of calculations, even in a next-generation CFD simulation. While the calculation time can ideally be reduced as the number of processing elements grows, the data transfer time becomes dominant in the total execution time. In contrast to a massively parallel system architecture, the number of nodes in a system should be as small as possible to limit data transfers. The performance analyses also show that the memory bandwidth limits the performance of BCM and that the use of on-chip memory is effective in improving performance. A memory subsystem that achieves a higher sustained memory bandwidth is required. Therefore, a supercomputing system that consists of a small number of high-performance nodes is essential to achieve high sustained performance for next-generation CFD in the upcoming postpeta-scale era, because it reduces the data transfers that eventually become a bottleneck in large-scale simulations.

  140. A Platform-Specific Code Smell Alert System for High Performance Computing Applications Peer-reviewed

    Chunyan Wang, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings of 2014 IEEE International Parallel & Distributed Processing Symposium Workshops (IPDPSW) 653-662 2014

    DOI: 10.1109/IPDPSW.2014.76  

  141. On-chip checkpointing with 3D-stacked memories Peer-reviewed

    Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2014 International 3D Systems Integration Conference, 3DIC 2014 - Proceedings 1-6 2014

    Publisher: Institute of Electrical and Electronics Engineers Inc.

    DOI: 10.1109/3DIC.2014.7152173  

  142. An Energy Optimization Method for Vector Processing Mechanisms Peer-reviewed

    Ye Gao, Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2014 IEEE COOL CHIPS XVII 1-3 2014

    DOI: 10.1109/CoolChips.2014.6842957  

    ISSN: 2473-4683

  143. A compiler-assisted OpenMP migration method based on automatic parallelizing information Peer-reviewed

    Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8488 450-459 2014

    Publisher: Springer Verlag

    DOI: 10.1007/978-3-319-07518-1_30  

    ISSN: 0302-9743

    eISSN: 1611-3349

  144. An Approach to Customization of Compiler Directives for Application-Specific Code Transformations Peer-reviewed

    Xiong Xiao, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2014 IEEE 8th International Symposium on Embedded Multicore/Manycore SoCs (MCSoC) 99-106 2014

    DOI: 10.1109/MCSoC.2014.23  

  145. Xevolver: An XML-based Code Translation Framework for Supporting HPC Application Migration Peer-reviewed

    Hiroyuki Takizawa, Shoichi Hirasawa, Yasuharu Hayashi, Ryusuke Egawa, Hiroaki Kobayashi

    2014 21st International Conference on High Performance Computing (HiPC) 1-11 2014

    DOI: 10.1109/HiPC.2014.7116902  

    ISSN: 1094-7256

  146. Xevolver: an XML-based programming framework for software evolution Peer-reviewed

    Hiroyuki Takizawa, Shoichi Hirasawa, Hiroaki Kobayashi

    ACM/IEEE Supercomputing Conference 2013 (SC13) 1-2 2013/11

  147. An Automatic Performance Tracking System for Software Evolution Peer-reviewed

    Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IPSJ Transactions on Advanced Computing Systems 2013/10

  148. A Capacity-Aware Thread Scheduling Method Combined with Cache Partitioning to Reduce Inter-Thread Cache Conflicts Peer-reviewed

    Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEICE Transactions on Information and Systems E96-D (9) 2047-2054 2013/09

    DOI: 10.1587/transinf.E96.D.2047  

    ISSN: 1745-1361

  149. A Study on Improving Cache Energy Efficiency with a Block Bypass Mechanism

    高井 拓実, 佐藤 雅之, 江川 隆輔, 滝沢 寛之, 小林 広明

    並列/分散/協調処理に関する「北九州」サマー・ワークショップ (SWoPP2013) 1-9 2013/07

  150. Performance Evaluation of a Next-Generation CFD on Various Supercomputing Systems

    Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Sustained Simulation Performance 2012 123-132 2013

    Publisher: Springer Berlin Heidelberg

    DOI: 10.1007/978-3-642-32454-3_11  

  151. Analysing the performance improvements of optimizations on modern HPC systems Peer-reviewed

    Kazuhiko Komatsu, Toshihide Sasaki, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Sustained Simulation Performance 2013 - Proceedings of the Joint Workshop on Sustained Simulation Performance 13-25 2013

    Publisher: Springer Science and Business Media, LLC

    DOI: 10.1007/978-3-319-01439-5_2

  152. HPC refactoring with hierarchical abstractions to help software evolution Peer-reviewed

    Hiroyuki Takizawa, Ryusuke Egawa, Daisuke Takahashi, Reiji Suda

    Sustained Simulation Performance 2012 - Proceedings of the Joint Workshop on High Performance Computing on Vector Systems, and Workshop on Sustained Simulation Performance 27-33 2013

    Publisher: Springer Science and Business Media, LLC

    DOI: 10.1007/978-3-642-32454-3_3

  153. Performance evaluation of phase-based correspondence matching on GPUs Peer-reviewed

    Mamoru Miura, Kinya Fudano, Koichi Ito, Takafumi Aoki, Hiroyuki Takizawa, Hiroaki Kobayashi

    APPLICATIONS OF DIGITAL IMAGE PROCESSING XXXVI 8856 2013

    DOI: 10.1117/12.2023550  

    ISSN: 0277-786X

    eISSN: 1996-756X

  154. Checkpoint-Restart for Heterogeneous Computing Systems Invited

    Hiroyuki Takizawa, Masayuki Sato, Ryusuke Egawa, Hiroaki Kobayashi

    Reliability Engineering Association of Japan 35 (8) 515 2013

    DOI: 10.11348/reajshinrai.35.8_515  

  155. A Flexible Insertion Policy for Dynamic Cache Resizing Mechanisms Peer-reviewed

    Masayuki Sato, Yusuke Tobo, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2013 IEEE COOL CHIPS XVI (COOL CHIPS) 1-3 2013

    DOI: 10.1109/CoolChips.2013.6547923  

    ISSN: 2473-4683

  156. clMPI: An OpenCL extension for interoperation with the Message Passing Interface Peer-reviewed

    Hiroyuki Takizawa, Makoto Sugawara, Shoichi Hirasawa, Isaac Gelado, Hiroaki Kobayashi, Wen-Mei W. Hwu

    Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013 1138-1148 2013

    Publisher: IEEE Computer Society

    DOI: 10.1109/IPDPSW.2013.183  

  157. A comparison of performance tunabilities between OpenCL and OpenACC Peer-reviewed

    Makoto Sugawara, Shoichi Hirasawa, Kazuhiko Komatsu, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings - IEEE 7th International Symposium on Embedded Multicore/Manycore System-on-Chip, MCSoC 2013 147-152 2013

    Publisher: IEEE Computer Society

    DOI: 10.1109/MCSoC.2013.31  

  158. Design and Evaluation of a Media-oriented Vector Processor with a Multi-banked Cache Memory Peer-reviewed

    Ye Gao, Naoki Shoji, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2013 IEEE 11TH SYMPOSIUM ON EMBEDDED SYSTEMS FOR REAL-TIME MULTIMEDIA (ESTIMEDIA) 78-87 2013

    DOI: 10.1109/ESTIMedia.2013.6704506  

    ISSN: 2325-1271

  159. Performance evaluation of BCM on various supercomputing systems Peer-reviewed

    Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

    The 24th International Conference on Parallel Computational Fluid Dynamics 1-2 2012/11

  160. Performance Evaluation of BCM on Various Supercomputing Systems Peer-reviewed

    Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

    24th International Conference on Parallel Computational Fluid Dynamics 2012/05/21

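  161. An Early Dead-Block Eviction Policy for Improving the Energy Efficiency of Way-Adaptable Caches Peer-reviewed

    Yusuke Tobo, Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Symposium on Advanced Computing Systems and Infrastructures (SACSIS2012) 4-5 2012/05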

  162. A Self-Organizing Overlay Network Mechanism Spreading Meta-Information of Resources Based on Users' Locality of Interests for Efficient Resource Discovery Peer-reviewed

    Tsutomu Inaba, Yoshitomo Murata, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEICE Transactions on Information and Systems (Japanese Edition) J95-D (5) 1110-1122 2012/05/01

    Publisher: The Institute of Electronics, Information and Communication Engineers

    ISSN: 1880-4535

  163. A bypass mechanism for way-adaptable caches Peer-reviewed

    Takumi Takai, Yusuke Tobo, Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEEE COOL Chips XV 2012/04

  164. Performance and scalability analysis of a chip multi vector processor Peer-reviewed

    Yoshiei Sato, Akihiro Musa, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

    High Performance Computing on Vector Systems 2011 3-20 2012

    Publisher: Springer Science and Business Media, LLC

    DOI: 10.1007/978-3-642-22244-3_1

  165. A prototype implementation of OpenCL for SX vector systems Peer-reviewed

    Hiroyuki Takizawa, Ryusuke Egawa, Hiroaki Kobayashi

    High Performance Computing on Vector Systems 2011 41-50 2012

    Publisher: Springer Science and Business Media, LLC

    DOI: 10.1007/978-3-642-22244-3_3

  166. Exploring Design Space of a 3D Stacked Vector Cache Peer-reviewed

    Ryusuke Egawa, Yusuke Endo, Jubee Tada, Hiroyuki Takizawa, Hiroaki Kobayashi

    2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC) 1477-1477 2012

  167. A capacity-efficient insertion policy for dynamic cache resizing mechanisms Peer-reviewed

    Masayuki Sato, Yusuke Tobo, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    CF '12 - Proceedings of the ACM Computing Frontiers Conference 265-267 2012

    DOI: 10.1145/2212908.2212949  

  168. A media-oriented vector architectural extension with a high bandwidth cache system Peer-reviewed

    Ye Gao, Naoki Shoji, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Symposium on Low-Power and High-Speed Chips - Proceedings for 2012 IEEE COOL Chips XV 1-3 2012

    DOI: 10.1109/COOLChips.2012.6216588  

  169. An out-of-order vector processing mechanism for multimedia applications Peer-reviewed

    Ye Gao, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    CF '12 - Proceedings of the ACM Computing Frontiers Conference 233-235 2012

    DOI: 10.1145/2212908.2212941  

  170. GPU IMPLEMENTATION OF PHASE-BASED STEREO CORRESPONDENCE AND ITS APPLICATION Peer-reviewed

    Mamoru Miura, Kinya Fudano, Koichi Ito, Takafumi Aoki, Hiroyuki Takizawa, Hiroaki Kobayashi

    2012 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2012) 1697-1700 2012

    DOI: 10.1109/ICIP.2012.6467205  

    ISSN: 1522-4880

  171. Improving the Scalability of Transparent Checkpointing for GPU Computing Systems Peer-reviewed

    Alfian Amrizal, Shoichi Hirasawa, Kazuhiko Komatsu, Hiroyuki Takizawa, Hiroaki Kobayashi

    TENCON 2012 - 2012 IEEE REGION 10 CONFERENCE: SUSTAINABLE DEVELOPMENT THROUGH HUMANITARIAN TECHNOLOGY 1-6 2012

    DOI: 10.1109/TENCON.2012.6412343  

    ISSN: 2159-3442

  172. Exploring Design Space of a 3D Stacked Vector Cache Peer-reviewed

    Ryusuke Egawa, Yusuke Endo, Hiroyuki Takizawa, Hiroaki Kobayashi, Jubee Tada

    2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC) 1475-+ 2012

  173. A Network Clustering Algorithm for Sybil-Attack Resisting Peer-reviewed

    Ling Xu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E94D (12) 2345-2352 2011/12

    DOI: 10.1587/transinf.E94.D.2345  

    ISSN: 0916-8532

    eISSN: 1745-1361

  174. Performance of building cube method on various platforms Peer-reviewed

    Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

    The 8th International Conference on Flow Dynamics 2011 (ICFD2011) 2011/11

  175. An automatic task assignment method for heterogeneous computing systems Peer-reviewed

    Katsuto Sato, Kazuhiko Komatsu, Hiroyuki Takizawa, Hiroaki Kobayashi

    The 8th International Conference on Flow Dynamics 2011 (ICFD2011) 2011/11

  176. Job Scheduling with Migration for Heterogeneous Computing Systems Peer-reviewed

    Kentaro Koyama, Katsuto Sato, Kazuhiko Komatsu, Yoshitomo Murata, Hiroyuki Takizawa, Hiroaki Kobayashi

    IPSJ Transactions on Advanced Computing Systems 4 (4) 203-213 2011/10/05

    ISSN: 1882-7829

  177. A patch-based bit mask filtering method for micropolygon rasterization Peer-reviewed

    Jiali Yao, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    High-Performance Graphics (HPG2011) 2011/08

  178. Performance of SOR methods on modern vector and scalar processors Peer-reviewed

    Takashi Soga, Akihiro Musa, Koki Okabe, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

    COMPUTERS & FLUIDS 45 (1) 215-221 2011/06

    DOI: 10.1016/j.compfluid.2010.12.024  

    ISSN: 0045-7930

  179. Parallel processing of the Building-Cube Method on a GPU platform Peer-reviewed

    Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

    COMPUTERS & FLUIDS 45 (1) 122-128 2011/06

    DOI: 10.1016/j.compfluid.2010.12.019  

    ISSN: 0045-7930

  180. A Performance Tuning Strategy Based on the Roofline Model for Vector Processors Peer-reviewed

    Yoshiei Sato, Ryuichi Nagaoka, Akihiro Musa, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

    IPSJ Transactions on Advanced Computing Systems 4 (3) 77-87 2011/05/12

    ISSN: 1882-7772

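  181. A Low-Energy-Oriented Insertion Policy for Way-Adaptable Caches Peer-reviewed

    Yusuke Tobo, Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Symposium on Advanced Computing Systems and Infrastructures (SACSIS2011) 2011 213-214 2011/05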

  182. Power-aware insertion policy for the way-adaptable caches Peer-reviewed

    Yusuke Tobo, Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEEE COOL Chips XIV 2011/04

  183. Energy Consumption of a Chip Multi-Vector Processor Using Real Applications

    Ryuichi Nagaoka, Yoshiei Sato, Akihiro Musa, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IPSJ SIG Technical Reports (CD-ROM) 2010 (5) ROMBUNNO.ARC-192,NO.3 2011/02/15

    ISSN: 2186-2583

  184. A High-performance Volunteer Computing Environment with a Dynamic Load-balancing Mechanism Peer-reviewed

    Yoshitomo Murata, Yuki Ishimori, Hiroyuki Takizawa, Hiroaki Kobayashi

    Transactions of Information Processing Society of Japan 52 (2) 401-414 2011/02/15

    ISSN: 1882-7837

  185. Performance Evaluation of Real-Time Stereo Correspondence on GPU

    Tohoku-Section Joint Convention Record of Institutes of Electrical and Information Engineers, Japan 2011 31-31 2011

    Publisher: Organizing Committee of Tohoku-Section Joint Convention of Institutes of Electrical and Information Engineers, Japan

    DOI: 10.11528/tsjc.2011.0_31  

  186. A Self-Organized Overlay Network Management Mechanism for Heterogeneous Environments

    Tsutomu Inaba, Hiroyuki Takizawa, Hiroaki Kobayashi

    Information and Media Technologies 6 (2) 546-559 2011

    Publisher: Information and Media Technologies Editorial Board

    DOI: 10.11185/imt.6.546  

    The technologies of Cloud Computing and NGN are now driving a paradigm shift in which various services are provided to business users over the network. In conjunction with this movement, many studies are under way to realize a ubiquitous computing environment in which a huge number of individual users can share their computing resources on the Internet, such as personal computers (PCs), game consoles, sensors and so on. To realize an effective resource discovery mechanism for such an environment, this paper presents an adaptive overlay network that enables a self-organizing resource management system to efficiently adapt to a heterogeneous environment. The proposed mechanism is composed of two functions. One adjusts the number of logical links of a resource, which forward search queries, so that less-useful query flooding can be reduced. The other connects resources so as to decrease the communication latency on the physical network rather than the number of query hops on the overlay network. To further improve the discovery efficiency, this paper integrates these functions into a self-organizing resource management system, SORMS, which was proposed in our previous work. The simulation results indicate that the proposed mechanism can increase the number of discovered resources by 60% without decreasing the discovery efficiency, and can reduce the total communication traffic by 80% compared with the original SORMS. This performance improvement is obtained by efficient control of logical links in a large-scale network.

  187. NVCR: A transparent checkpoint-restart library for NVIDIA CUDA Peer-reviewed

    Akira Nukada, Hiroyuki Takizawa, Satoshi Matsuoka

    IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum 104-113 2011

    DOI: 10.1109/IPDPS.2011.131  

  188. Power-aware dynamic cache partitioning for CMPs Peer-reviewed

    Isao Kotera, Kenta Abe, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6590 135-153 2011

    DOI: 10.1007/978-3-642-19448-1_8  

    ISSN: 0302-9743 1611-3349

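  189. A Runtime Dependency Analysis Method for Supporting Task Parallelization in OpenCL Peer-reviewed

    Katsuto Sato, Kazuhiko Komatsu, Hiroyuki Takizawa, Hiroaki Kobayashi

    IPSJ Transactions on Advanced Computing Systems (ACS) 5 (1) 53-67 2011/01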

  190. A Runtime Dependency Analysis Method for Task Parallelization of OpenCL Programs Peer-reviewed

    Katsuto Sato, Kazuhiko Komatsu, Hiroyuki Takizawa, Hiroaki Kobayashi

    IPSJ Transactions on Advanced Computing Systems 4 (5) 2011

  191. A self-organized overlay network management mechanism for heterogeneous environments Peer-reviewed

    Tsutomu Inaba, Hiroyuki Takizawa, Hiroaki Kobayashi

    Journal of Information Processing 19 (0) 25-38 2011

    Publisher: Information Processing Society of Japan

    DOI: 10.2197/ipsjjip.19.25  

    ISSN: 1882-6652 0387-5806

  192. A history-based performance prediction model with profile data classification for automatic task allocation in heterogeneous computing systems Peer-reviewed

    Katsuto Sato, Kazuhiko Komatsu, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings - 9th IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2011 135-142 2011

    DOI: 10.1109/ISPA.2011.36  

  193. Effects of 3-D stacked vector cache on energy consumption Peer-reviewed

    Ryusuke Egawa, Yusuke Funaya, Ryuichi Nagaoka, Yusuke Endo, Akihiro Musa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2011 IEEE International 3D Systems Integration Conference, 3DIC 2011 2011

    DOI: 10.1109/3DIC.2012.6263026  

  194. CheCL: Transparent checkpointing and process migration of OpenCL applications Peer-reviewed

    Hiroyuki Takizawa, Kentaro Koyama, Katsuto Sato, Kazuhiko Komatsu, Hiroaki Kobayashi

    Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011 864-876 2011

    DOI: 10.1109/IPDPS.2011.85  

  195. A performance tuning strategy under combining loop transforms for a vector processor with an on-chip cache Peer-reviewed

    Yoshiei Sato, Ryuichi Nagaoka, Akihiro Musa, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

    ACM/IEEE Supercomputing Conference (SC10) 2010/11

  196. Evaluating Performance and Portability of OpenCL Programs Peer-reviewed

    Kazuhiko Komatsu, Katsuto Sato, Yusuke Arai, Kentaro Koyama, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings of the 5th international Workshop on Automatic Performance Tuning 1-15 2010/06

  197. Automatic tuning of CUDA execution parameters for stencil processing Peer-reviewed

    Katsuto Sato, Hiroyuki Takizawa, Kazuhiko Komatsu, Hiroaki Kobayashi

    Software Automatic Tuning: From Concepts to State-of-the-Art Results 209-228 2010

    Publisher: Springer New York

    DOI: 10.1007/978-1-4419-6935-4_13  

  198. Lessons Learned from 1-Year Experience with SX-9 and Toward the Next Generation Vector Computing Peer-reviewed

    Hiroaki Kobayashi, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Akihiro Musa, Takashi Soga, Yoko Isobe

    HIGH PERFORMANCE COMPUTING ON VECTOR SYSTEMS 2009 3-+ 2010

    DOI: 10.1007/978-3-642-03913-3_1  

  199. Cache partitioning strategies for 3-D stacked vector processors Peer-reviewed

    Yusuke Funaya, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEEE 3D System Integration Conference 2010, 3DIC 2010 1-6 2010

    DOI: 10.1109/3DIC.2010.5751453  

  200. A Load-Forwarding Mechanism for the Vector Architecture in Multimedia Applications Peer-reviewed

    Ye Gao, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    13TH EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN: ARCHITECTURES, METHODS AND TOOLS 412-415 2010

    DOI: 10.1109/DSD.2010.93  

  201. Efficient data management for the building cube method using cartesian meshes on the GPU platform Peer-reviewed

    Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

    International Supercomputing Conference (ISC10) 2010

  202. A majority-based control scheme for way-adaptable caches Peer-reviewed

    Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6310 16-28 2010

    DOI: 10.1007/978-3-642-16233-6_5  

    ISSN: 0302-9743 1611-3349

  203. Resisting sybil attack by social network and network clustering Peer-reviewed

    Ling Xu, Satayapiwat Chainan, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings - 2010 10th Annual International Symposium on Applications and the Internet, SAINT 2010 15-21 2010

    DOI: 10.1109/SAINT.2010.32  

  204. A Voting-Based Working Set Assessment Scheme for Dynamic Cache Resizing Mechanisms Peer-reviewed

    Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2010 IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN 98-105 2010

    DOI: 10.1109/ICCD.2010.5647599  

    ISSN: 1063-6404

  205. Design and early evaluation of a 3-D die stacked chip multi-vector processor Peer-reviewed

    Ryusuke Egawa, Yusuke Funaya, Ryu-Ichi Nagaoka, Akihiro Musa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEEE 3D System Integration Conference 2010, 3DIC 2010 1-8 2010

    DOI: 10.1109/3DIC.2010.5751448  

  206. Performance of hemisphere algorithm for fast form factor calculation Peer-reviewed

    Noboru Yamada, Tomoaki Shinoda, Hiroyuki Takizawa

    Heat Transfer - Asian Research 38 (7) 450-463 2009/11

    DOI: 10.1002/htj.20259  

    ISSN: 1099-2871 1523-1496

  207. Performance Optimization Techniques for Vector Processors with Cache Memory

    Yoshiei Sato, Ryuichi Nagaoka, Akihiro Musa, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

    IPSJ SIG Technical Reports (CD-ROM) 2009 (3) ROMBUNNO.ARC-184,6 2009/10/15

    ISSN: 2186-2583

  208. Working Sets based Thread Scheduling with Cache Partitioning Peer-reviewed

    Masayuki Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Poster Abstracts of The Eighteenth International Conference on Parallel Architecture and Compilation Techniques 12 2009/09

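  209. Thread Scheduling Based on Working Set Assessment

    Masayuki Sato, Isao Kotera, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Summer United Workshops on Parallel, Distributed and Cooperative Processing "Sendai" (SWoPP Sendai 2009) 1-10 2009/08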

  210. Early evaluation of a memory-stacked vector processor Peer-reviewed

    Yusuke Funaya, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEEE COOL Chips XII 165 2009/04

  211. A Cache-Aware Thread Scheduling Policy for Multi-Core Processors Peer-reviewed

    Masayuki Sato, Isao Kotera, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    The IASTED International Conference on Parallel and Distributed Computing and Networks 2009/02

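  212. Performance Evaluation of SX-9 Using Real Applications

    Takashi Soga, Youichi Shimomura, Akihiro Musa, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

    IPSJ Symposium Series 2009 (2) 57-64 2009/01/15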

    ISSN: 1344-0640

  213. Evaluating Computational Performance of Backpropagation Learning on Graphics Hardware Peer-reviewed

    Hiroyuki Takizawa, Tatsuya Chida, Hiroaki Kobayashi

    Electronic Notes in Theoretical Computer Science 225 (C) 379-389 2009/01/02

    DOI: 10.1016/j.entcs.2008.12.087  

    ISSN: 1571-0661

  214. 3D On-Chip Memory for the Vector Architecture Peer-reviewed

    Yusuke Funaya, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2009 IEEE INTERNATIONAL CONFERENCE ON 3D SYSTEMS INTEGRATION 352-357 2009

    ISSN: 2164-0157

  215. Performance of Hemisphere Algorithm for Fast Form Factor Calculation Peer-reviewed

    Noboru Yamada, Tomoaki Shinoda, Hiroyuki Takizawa

    Transactions of the Japan Society of Mechanical Engineers B 075 (749) 132-139 2009/01

    Publisher: The Japan Society of Mechanical Engineers

    DOI: 10.1299/kikaib.75.749_132  

    ISSN: 0387-5016

    Developing fast and accurate algorithms for radiative heat transfer simulation is important for efficient thermal design and simulation in diverse engineering fields. This paper describes the performance of the Hemisphere algorithm, which was originally developed for fast form factor calculation in the field of photorealistic three-dimensional computer graphics. We compared the performance of the Hemisphere algorithm with that of two conventional methods that are frequently used in radiative heat transfer simulation. The results show that the Hemisphere algorithm is significantly faster than the conventional methods if an absolute error of 1.0×10^-5 is acceptable. In addition, the results indicate that the Hemisphere algorithm may be suitable for the trial-and-error process of large-scale model simulation because of its tolerable form factor distribution.

  216. Characteristics of an On-Chip Cache on NEC SX Vector Architecture Peer-reviewed

    Akihiro Musa, Yoshiei Sato, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Interdisciplinary Information Sciences 15 (1) 51-66 2009/01

    Publisher: Graduate School of Information Sciences, Tohoku University

    DOI: 10.4036/iis.2009.51  

    ISSN: 1340-9050

    Thanks to their high effective memory bandwidth, vector systems can achieve high computational efficiency for computation-intensive scientific applications. However, they have been encountering the memory wall problem, and the effective memory bandwidth rate has decreased: the bytes per flop rates of recent vector systems have dropped from 4 (SX-7 and SX-8) to 2 (SX-8R) and 2.5 (SX-9). The situation is getting worse as many function units and/or cores are brought into a single chip, because the pin bandwidth is limited and does not scale. To solve the problem, we propose an on-chip cache, called vector cache, to maintain the effective memory bandwidth rate of future vector supercomputers. The vector cache employs a bypass mechanism between the main memory and register files under software control. We evaluate the performance of the vector cache on the NEC SX vector processor architecture with bytes per flop rates of 2 B/FLOP and 1 B/FLOP, to clarify the basic characteristics of the vector cache. For the evaluation, we use the NEC SX-7 simulator extended with the vector cache mechanism. The benchmark programs for performance evaluation are two DAXPY-like loops and five leading scientific applications. The results indicate that the vector cache boosts the computational efficiencies of the 2 B/FLOP and 1 B/FLOP systems up to the level of the 4 B/FLOP system. In particular, when cache hit rates exceed 50%, the 2 B/FLOP system can achieve performance comparable to the 4 B/FLOP system. The vector cache with the bypass mechanism can provide data from both the main memory and the cache simultaneously. In addition, from the viewpoint of cache design, we investigate the impact of cache associativity on the cache hit rate, and the relationship between cache latency and performance. The results also suggest that the associativity hardly affects the cache hit rate, and that the effects of the cache latency depend on the vector loop length of applications. A shorter cache latency contributes to the performance improvement of applications with shorter loop lengths, even in the case of the 4 B/FLOP system. In the case of longer loop lengths of 256 or more, the latency can effectively be hidden, and the performance is not sensitive to the cache latency. Finally, we discuss the effects of selective caching using the bypass mechanism and loop unrolling on the vector cache performance for the scientific applications. Selective caching is effective for efficient use of the limited cache capacity. Loop unrolling is also effective for improving performance, resulting in a synergistic effect with caching. However, there are exceptional cases in which loop unrolling worsens the cache hit rate due to an increase in the working space needed to process the unrolled loops over the cache. In such cases, the increase in the cache miss rate cancels the gain obtained by unrolling.

  217. Performance tuning and analysis of future vector processors based on the roofline model Peer-reviewed

    Yoshiei Sato, Ryuichi Nagaoka, Akihiro Musa, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

    ACM International Conference Proceeding Series 7-14 2009

    DOI: 10.1145/1621960.1621962  

  218. CheCUDA: A Checkpoint/Restart Tool for CUDA Applications Peer-reviewed

    Hiroyuki Takizawa, Katsuto Sato, Kazuhiko Komatsu, Hiroaki Kobayashi

    2009 INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES (PDCAT 2009) 408-+ 2009

    DOI: 10.1109/PDCAT.2009.78  

  219. Performance Evaluation of NEC SX-9 using Real Science and Engineering Applications Peer-reviewed

    Takashi Soga, Akihiro Musa, Youichi Shimomura, Ken'ichi Itakura, Koki Okabe, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    PROCEEDINGS OF THE CONFERENCE ON HIGH PERFORMANCE COMPUTING NETWORKING, STORAGE AND ANALYSIS 2009

    DOI: 10.1145/1654059.1654088  

  220. Auction-based Resource Allocation for Activating Incentives in Resource Trading in Grid Computing Peer-reviewed

    Chainan Satayapiwat, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings of The 2008 IEEE International Symposium on Parallel and Distributed Processing with Applications 252-260 2008/12

  221. Caching on a chip multi vector processor Peer-reviewed

    Akihiro Musa, Yoshiei Sato, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

    ACM/IEEE Supercomputing Conference (SC08) 2008/11

  222. SPRAT: A Stream Programming Language with Runtime Auto-Tuning Peer-reviewed

    Hiroyuki Takizawa, Hiroki Shiratori, Katsuto Sato, Hiroaki Kobayashi

    IPSJ Transactions on Advanced Computing Systems 1 (2) 207-220 2008/08

  223. A Reliability Model for Result Checking in Volunteer Computing Peer-reviewed

    Ling Xu, Hiroyuki Takizawa, Hiroaki Kobayashi

    SAINT2008 201-204 2008/07

    DOI: 10.1109/SAINT.2008.25  

  224. A Fast Ray Frustum-Triangle Intersection Algorithm with Precomputation and Early Termination Peer-reviewed

    Kazuhiko Komatsu, Yoshiyuki Kaeriyama, Kenichi Suzuki, Hiroyuki Takizawa, Hiroaki Kobayashi

    IPSJ Transactions on Advanced Computing Systems 1 (1) 85-95 2008/04

  225. A Distributed and Cooperative Load Balancing Method for Large-Scale Computing Environments Peer-reviewed

    Yoshitomo Murata, Tsutomu Inaba, Hiroyuki Takizawa, Hiroaki Kobayashi

    IPSJ Journal 49 (3) 1214-1228 2008/03

  226. A Parallel Image Generation Algorithm based on Photon Map Partitioning Peer-reviewed

    Masahide Tamura, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings of the 10th IASTED International Conference on Computer Graphics and Imaging (CGIM 2008) 145-151 2008/02

  227. An Efficient Intersection Algorithm Design of Ray Tracing For Many-Core Graphics Processors Peer-reviewed

    Kazuhiko Komatsu, Yoshiyuki Kaeriyama, Kenichi Suzuki, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings of the 10th IASTED International Conference on Computer Graphics and Imaging (CGIM 2008) 165-171 2008/02

  228. First Experiences with NEC SX-9

    Hiroaki Kobayashi, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Akihiro Musa, Takashi Soga, Yoichi Shimomura

    High Performance Computing on Vector Systems 3-11 2008

    Publisher: Springer

    DOI: 10.1007/978-3-540-85869-0_1  

  229. Modeling of cache access behavior based on Zipf's law Peer-reviewed

    Isao Kotera, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT 310 9-15 2008

    DOI: 10.1145/1509084.1509086  

    ISSN: 1089-795X

  230. A Fast Ray Frustum-Triangle Intersection Algorithm with Precomputation and Early Termination Peer-reviewed

    Kazuhiko Komatsu, Yoshiyuki Kaeriyama, Kenichi Suzuki, Hiroyuki Takizawa, Hiroaki Kobayashi

    IPSJ Online Transactions 1 (1) 1-11 2008

    Publisher: Information Processing Society of Japan

    DOI: 10.2197/ipsjtrans.1.1  

    ISSN: 1882-6660

    Although ray tracing is the best approach to high-quality image synthesis, it takes a long time to generate images because of its huge amount of computation. In particular, ray-primitive intersection tests still dominate the execution time of ray tracing, and faster ray-primitive intersection algorithms are strongly required to interactively generate higher-quality images with more advanced effects. This paper presents a new fast algorithm for the intersection tests that makes good use of ray and object coherence in ray tracing. The proposed algorithm exploits the fact that the rays in a bundle share the same origin and have massive coherence. By reducing redundant calculations in the innermost intersection tests for the bundles through precomputation and early termination, the proposed algorithm accelerates the intersection tests. Experimental results show that, by exploiting these features of ray bundles, the proposed algorithm performs intersection tests 1.43 times faster than Möller's algorithm.

  231. The potential of on-chip memory systems for future vector architectures Peer-reviewed

    Hiroaki Kobayashi, Akihiro Musa, Yoshiei Sato, Hiroyuki Takizawa, Koki Okabe

    HIGH PERFORMANCE COMPUTING ON VECTOR SYSTEMS 2007 247-+ 2008

  232. A Utility-based Double Auction Mechanism for Efficient Grid Resource Allocation Peer-reviewed

    Chainan Satayapiwat, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    PROCEEDINGS OF THE 2008 INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS 252-260 2008

    DOI: 10.1109/ISPA.2008.103  

  233. SPRAT: Runtime Processor Selection for Energy-aware Computing Peer-reviewed

    Hiroyuki Takizawa, Katsuto Sato, Hiroaki Kobayashi

    2008 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING 386-393 2008

    ISSN: 1552-5244

  234. A Performance Study of Secure Data Mining on the Cell Processor Peer-reviewed

    Hong Wang, Hiroyuki Takizawa, Hiroaki Kobayashi

    CCGRID 2008: EIGHTH IEEE INTERNATIONAL SYMPOSIUM ON CLUSTER COMPUTING AND THE GRID, VOLS 1 AND 2, PROCEEDINGS 633-+ 2008

    DOI: 10.1109/CCGRID.2008.16  

  235. Implementation and Evaluation of a Distributed and Cooperative Load-Balancing Mechanism for Dependable Volunteer Computing Peer-reviewed

    Yoshitomo Murata, Tsutomu Inaba, Hiroyuki Takizawa, Hiroaki Kobayashi

    2008 IEEE INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS & NETWORKS WITH FTCS & DCC 316-+ 2008

    DOI: 10.1109/DSN.2008.4630100  

    ISSN: 1530-0889

  236. Consideration of resource access history for optimizing overlay networks in P2P-based resource discovery Peer-reviewed

    Tsutomu Inaba, Yoshitomo Murata, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings - 2008 International Symposium on Applications and the Internet, SAINT 2008 269-272 2008

    DOI: 10.1109/SAINT.2008.104  

  237. SPRAT: Runtime processor selection for energy-aware computing Peer-reviewed

    Hiroyuki Takizawa, Katsuto Sato, Hiroaki Kobayashi

    Proceedings - IEEE International Conference on Cluster Computing, ICCC 2008 386-393 2008

    Publisher: Institute of Electrical and Electronics Engineers Inc.

    DOI: 10.1109/CLUSTR.2008.4663799  

    ISSN: 1552-5244

  238. A shared cache for a chip multi vector processor Peer-reviewed

    Akihiro Musa, Yoshiei Sato, Takashi Soga, Koki Okabe, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT 310 24-29 2008

    DOI: 10.1145/1509084.1509088  

    ISSN: 1089-795X

  239. Effects of MSHR and Prefetch Mechanisms on an On-Chip Cache of the Vector Architecture Peer-reviewed

    Akihiro Musa, Yoshiei Sato, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

    PROCEEDINGS OF THE 2008 INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS 335-+ 2008

    DOI: 10.1109/ISPA.2008.100  

  240. A Progressive 3-D Meshing Algorithm for Interactive Simulation of Soft Bodies Peer-reviewed

    SAOI Tomoyuki, TAKIZAWA Hiroyuki, KOBAYASHI Hiroaki

    Journal of INFORMATION 10 (6) 761-776 2007/12

  241. A dependable Peer-to-Peer computing platform Peer-reviewed

    Hong Wang, Hiroyuki Takizawa, Hiroaki Kobayashi

    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE 23 (8) 939-955 2007/11

    DOI: 10.1016/j.future.2007.03.004  

    ISSN: 0167-739X

    eISSN: 1872-7115

  242. Early evaluation of on-chip vector caching for the NEC SX vector architecture Peer-reviewed

    Akihiro Musa, Yoshiei Sato, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

    ACM/IEEE Supercomputing Conference (SC07) 2007/11

  243. An Efficient Control Mechanism for Self-Organizing Overlay Networks of Large-Scale P2P Systems Peer-reviewed

    Hiroaki Kobayashi, Hiroyuki Takizawa, Takuro Okawa, Tsutomu Inaba

    Interdisciplinary Information Sciences 13 (2) 227-237 2007/09/18

    Publisher: Tohoku University

    DOI: 10.4036/iis.2007.227  

    ISSN: 1340-9050

    P2P (Peer to Peer) has great potential to handle highly distributed computing resources and is expected to be a key technology to realize ubiquitous computing environments over the Internet. However, P2P systems tend to waste network bandwidth for resource acquisition because of their decentralized resource management. This paper presents an efficient control mechanism for self-organizing overlay networks of large-scale P2P systems, and evaluates its performance in detail. The overlay network is configured by making local clusters reflect current interests of individual peers and connecting them together based on their similarity. As a result, the overlay network provides the resource exploitation space for some specific interests. In addition, the overlay network can dynamically be reconfigured based on the change in the interests of individual peers across time so that more useful peers at that time can be reconnected closer to their client peers. Therefore, multicasting of resource requesting messages can be carried out only over peers with similar interests that are dynamically connected through the overlay network, resulting in a remarkable decrease in both messages for resource acquisition and hops a resource requesting query travels to reach the peer that satisfies the request. Experimental results indicate that the proposed mechanism can realize effective self-organization of the overlay network in which useful peers are dynamically relocated around client peers. In addition, the adaptive allocation of links to peers according to their capability works well to keep the higher performance and fault-tolerance of the self-organizing overlay network.

  244. A Power-Aware Shared Cache Mechanism Based on Locality Assessment of Memory Reference for CMPs Peer-reviewed

    Isao Kotera, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings of the 8th MEDEA workshop 121-128 2007/09/16

  245. Analysis of hardware resource conflicts for runtime performance prediction of SMT processors Peer-reviewed

    Masayuki Sato, Yusuke Funaya, Isao Kotera, Hiroyuki Takizawa, Hiroaki Kobayashi

    Information Technology Letters 6 67-70 2007/09/05

  246. A Power-Aware and Way-Allocatable Shared Cache Mechanism Peer-reviewed

    Isao Kotera, Hiroyuki Takizawa, Hiroaki Kobayashi

    Information Technology Letters 6 55-58 2007/09/05

  247. Partial distortion entropy maximization for online data clustering Peer-reviewed

    Hiroyuki Takizawa, Hiroaki Kobayashi

    NEURAL NETWORKS 20 (7) 819-831 2007/09

    DOI: 10.1016/j.neunet.2007.04.029  

    ISSN: 0893-6080

  248. An Estimation-Based Redundant Task Dispatch Policy for Volunteer Computing Platforms Peer-reviewed

    Hong Wang, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings of the International Conference on Dependable Systems and Networks 348-349 2007/06/25

    Fast Abstract (Supplemental Volume)

  249. A fair-sharing and power-aware L2 cache system for chip multiprocessors Peer-reviewed

    Isao Kotera, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEEE COOL Chips X 2007/04

  250. A power-aware shared cache mechanism based on locality assessment of memory reference for CMPs Peer-reviewed

    Isao Kotera, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT 113-120 2007

    DOI: 10.1145/1327171.1327185  

    ISSN: 1089-795X

  251. Preliminary evaluation for runtime auto-tuning of GPGPU applications Peer-reviewed

    Hiroyuki Takizawa, Hiroki Shiratori, Hiroaki Kobayashi

    The 2nd International Workshop on Automatic Performance Tuning 37-37 2007

  252. Performance Evaluation of K-Means Clustering on the Cell Processor Peer-reviewed

    Hong Wang, Hiroyuki Takizawa, Hiroaki Kobayashi

    Proceedings of High Performance Computing Symposium 2007 (1) 161-168 2007/01

  253. A memory-efficient scheme for fast spectral photon mapping Peer-reviewed

    Kosuke Ikeda, Hiroyuki Takizawa, Hiroaki Kobayashi

    PROCEEDINGS OF THE NINTH IASTED INTERNATIONAL CONFERENCE ON COMPUTER GRAPHICS AND IMAGING 75-80 2007

  254. An on-chip cache design for vector processors Peer-reviewed

    Akihiro Musa, Yoshiei Sato, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

    Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT 17-23 2007

    DOI: 10.1145/1327171.1327173  

    ISSN: 1089-795X

  255. An estimation-based redundant task dispatch policy for volunteer computing platforms Peer-reviewed

    Hong Wang, Hiroyuki Takizawa, Hiroaki Kobayashi

    The International Conference on Dependable Systems and Networks 348-349 2007

  256. A Dynamic Logical Link Management Mechanism for P2P Resource Discovery System Peer-reviewed

    Takuro Okawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Information Technology Letters 5 363-366 2006/09

    Publisher: Forum on Information Technology

  257. Thread Scheduling Based on the Thread Characteristics for Multi-Core Processors Peer-reviewed

    Yusuke Funaya, Isao Kotera, Hiroyuki Takizawa, Hiroaki Kobayashi

    Information Technology Letters 5 37-40 2006/09

  258. Towards Effective GPU Implementation of Neural Networks Peer-reviewed

    Hiroyuki Takizawa, Tatsuya Chida, Hiroaki Kobayashi

    The 4th International Conference on Information-MFCSIT’06 408-411 2006/08

  259. Hierarchical parallel processing of large scale data clustering on a PC cluster with GPU co-processing Peer-reviewed

    H Takizawa, H Kobayashi

    JOURNAL OF SUPERCOMPUTING 36 (3) 219-234 2006/06

    DOI: 10.1007/s11227-006-8294-1  

    ISSN: 0920-8542

  260. Design and Implementation of an Efficient Search Mechanism based on the Hybrid P2P Model for Ubiquitous Computing Systems Peer-reviewed

    T Inaba, T Okawa, Y Murata, H Takizawa, H Kobayashi

    INTERNATIONAL SYMPOSIUM ON APPLICATIONS AND THE INTERNET , PROCEEDINGS 45-+ 2006

    DOI: 10.1109/SAINT.2006.23  

  261. A distributed and cooperative load balancing mechanism for large-scale P2P systems Peer-reviewed

    Y Murata, T Inaba, H Takizawa, H Kobayashi

    INTERNATIONAL SYMPOSIUM ON APPLICATIONS AND THE INTERNET WORKSHOPS, PROCEEDINGS 126-129 2006

    DOI: 10.1109/SAINT-W.2006.2  

  262. Radiative heat transfer simulation using programmable graphics hardware Peer-reviewed

    Hiroyuki Takizawa, Noboru Yamada, Seigo Sakai, Hiroaki Kobayashi

    Proceedings - 5th IEEE/ACIS Int. Conf. on Comput. and Info. Sci., ICIS 2006. In conjunction with 1st IEEE/ACIS, Int. Workshop Component-Based Software Eng., Softw. Archi. and Reuse, COMSAR 2006 2006 29-37 2006

    DOI: 10.1109/ICIS-COMSAR.2006.70  

  263. Implications of memory performance for highly efficient supercomputing of scientific applications Peer-reviewed

    Akihiro Musa, Hiroyuki Takizawa, Koki Okabe, Takashi Soga, Hiroaki Kobayashi

    PARALLEL AND DISTRIBUTED PROCESSING AND APPLICATIONS 4330 845-+ 2006

    ISSN: 0302-9743

  264. Evaluation and Modeling of Resource Discovery in Large Scale P2P Systems Peer-reviewed

    Takurou Okawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Forum of Information Technology(FIT2005) Information Technology Letters 4 21-24 2005/09

    Publisher: Forum on Information Technology

  265. Performance Evaluation of the SX-7 System Using the HPC Challenge Benchmark Peer-reviewed

    Hiroyuki Takizawa, Tatsunobu Kokubo, Kenryo Kataumi, Hiroaki Kobayashi

    IPSJ journal 46 (SIG 12(ACS 11)) 37-45 2005/08

    Also presented at SACSIS2005 (May 2005)

  266. An Incremental Photon-Mapping Algorithm for Fast Walk-Through Animations Peer-reviewed

    Kosuke Ikeda, Hiroyuki Takizawa, Hiroaki Kobayashi

    Computer Graphics and Imaging (CGIM 2005) 2005/08

  267. Locality Analysis to Control Dynamically Way-Adaptable Caches Peer-reviewed

    KOBAYASHI Hiroaki, KOTERA Isao, TAKIZAWA Hiroyuki

    ACM SIGARCH Computer Architecture News 33 (3) 25-32 2005/06

    DOI: 10.1145/1101868.1101874  

  268. Evaluation of Large-Scale Remote Interactive Visualization via Super SINET Peer-reviewed

    TAKIZAWA Hiroyuki, KOBAYASHI Hiroaki

    Journal of INFORMATION 8 (3) 383-390 2005/05

  269. Performance Evaluation of the SX-7 System Using the HPC Challenge Benchmark Peer-reviewed

    Hiroyuki Takizawa, Tatsunobu Kokubo, Kenryo Kataumi, Hiroaki Kobayashi

    Symposium on Advanced Computing Systems and Infrastructures(SACSIS2005) 2005 (5) 25-33 2005/05

  270. A distributed cooperative scheduling mechanism for P2P computing

    Yoshitomo Murata, Tsutomu Inaba, Hiroyuki Takizawa, Hiroaki Kobayashi

    Advanced Network & Computing Technology Workshop (33) 23-30 2005/01/24

  271. A self-organizing overlay network to exploit the locality of interests for effective resource discovery in P2P systems Peer-reviewed

    H Kobayashi, H Takizawa, T Inaba, Y Takizawa

    2005 SYMPOSIUM ON APPLICATIONS AND THE INTERNET, PROCEEDINGS 246-255 2005

  272. A P2P Semantic Information Search Mechanism for Ubiquitous Grid Computing Systems

    Tsutomu Inaba, Takuro Okawa, Yoshitomo Murata, Hiroyuki Takizawa, Hiroaki Kobayashi

    Advanced Network & Computing Technology Workshop (33) 45-52 2005/01

  273. A workflow management mechanism for peer-to-peer computing platforms Peer-reviewed

    H Wang, H Takizawa, H Kobayashi

    PARALLEL AND DISTRIBUTED PROCESSING AND APPLICATIONS 3758 827-832 2005

    ISSN: 0302-9743

  274. Efficient parallel processing of competitive learning algorithms Peer-reviewed

    K Sano, S Momose, H Takizawa, H Kobayashi, T Nakamura

    PARALLEL COMPUTING 30 (12) 1361-1383 2004/12

    DOI: 10.1016/j.parco.2004.10.001  

    ISSN: 0167-8191

    eISSN: 1872-7336

  275. Evaluation of Large-Scale Remote Interactive Visualization via Super SINET Peer-reviewed

    TAKIZAWA Hiroyuki, KOBAYASHI Hiroaki

    The 3rd International Conference on Information (INFO'2004) 3 2004/11

  276. スーパーSINETを介した大規模遠隔対話的可視化の評価実験

    滝沢寛之, 小林広明

    全国共同利用情報基盤センター研究開発論文集 26 24-29 2004/11

  277. An Effective Control Mechanism for Way-Adaptable Caches

    KOTERA Isao, TAKIZAWA Hiroyuki, KOBAYASHI Hiroaki

    電気関係学会東北支部連合大会 2004/08

  278. スーパーSINETを利用した大規模遠隔可視化処理の評価

    滝沢寛之, 小林広明

    東北大学情報シナジーセンター年報 3 90-96 2004/06

  279. グリッドミドルウェアGlobusの資源探索と通信に関するオーバヘッドの定量的評価

    村田善智, 稲葉勉, 滝沢寛之, 小林広明

    東北大学情報シナジーセンター年報 3 115-123 2004/06

  280. An Effective Implementation of Vector Quantization Encoder on Commodity Graphics Hardware Peer-reviewed

    Hiroyuki TAKIZAWA, Hiroaki KOBAYASHI

    Proceedings of the 2nd International Conference on Information Technology and Applications(ICITA2004) 2004/01

  281. A fast computation scheme of partial distortion entropy updating Peer-reviewed

    H Takizawa, H Kobayashi

    ITCC 2004: INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: CODING AND COMPUTING, VOL 1, PROCEEDINGS 1 736-741 2004

    DOI: 10.1109/ITCC.2004.1286555  

  282. Multi-grain parallel processing of data-clustering on programmable graphics hardware Peer-reviewed

    H Takizawa, H Kobayashi

    PARALLEL AND DISTRIBUTED PROCESSING AND APPLICATIONS, PROCEEDINGS 3358 (3358) 16-27 2004

    ISSN: 0302-9743

  283. グリッド用動的資源管理のための自己組織化P2Pネットワークに関する一検討

    瀧澤泰明, 滝沢寛之, 佐野健太郎, 小林広明, 中村維男

    情報処理学会東北支部研究会 2003/11

  284. Vector Quantization Codebook Design Restraining Edge Degradation of Images Peer-reviewed

    TAKIZAWA Hiroyuki, MIURA Takeshi, KOBAYASHI Hiroaki, NAKAMURA Tadao

    FIT2003 Information Technology Letters 2 (2) 243-244 2003/09

  285. Vector quantization codebook design using the law-of-the-jungle algorithm Peer-reviewed

    H Takizawa, T Nakajima, K Sano, H Kobayashi, T Nakamura

    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E86D (6) 1068-1077 2003/06

    ISSN: 0916-8532

  286. A Comparison Study Of Vector Quantization Codebook Design Algorithms Based On The Equidistortion Principle Peer-reviewed

    Hiroyuki Takizawa, Taira Nakajima, Kentaro Sano, Hiroaki Kobayashi, Tadao Nakamura

    Proceedings of the 21st IASTED International Multi-Conference on Applied Informatics(AI2003) 255-261 2003/03

  287. A Decision Criterion to Relocate Codewords for Adaptive Vector Quantization Peer-reviewed

    H. Takizawa

    Proceedings of the 21st IASTED International Multi-Conference on Applied Informatics(AI2003) 262-268 2003/02

  288. Parallel Algorithm for the Law-of-the-Jungle Learning to the Fast Design of Optimal Codebooks Peer-reviewed

    Kentaro Sano, Shintaro Momose, Hiroyuki Takizawa, Taira Nakajima, Clecio Donizete Lima, Hiroaki Kobayashi, Tadao Nakamura

    Proceedings of the 14th IASTED International Conference on Parallel and Distributed Computing and Systems(PDCS2002) 723-728 2002/11

  289. Practical Volume Compression based on Vector Quantization using the Law-of-the-Jungle Algorithm Peer-reviewed

    Kentaro Sano, Hiroyuki Takizawa, Taira Nakajima, Hiroaki Kobayashi, Tadao Nakamura

    Proceedings of the 2nd IASTED International Conference on Visualization, Imaging and Image Processing(VIIP2002) 519-526 2002/09

  290. A Vector Quantizer preventing Image Degradation Peer-reviewed

    Takeshi Miura, Hiroyuki Takizawa, Kentaro Sano, Taira Nakajima, Hiroaki Kobayashi, Tadao Nakamura

    FIT Information Technology Letters 185-186 2002/09

  291. Parallel processing for vector quantization codebook design

    S. Momose, K. Sano, H. Takizawa, T. Nakajima, H. Kobayashi, T. Nakamura

    並列/協調/分散処理に関する「湯布院」サマーワークショップ資料 2002/08

  292. Updated Computer Systems of Integrated Information Processing Center, Niigata University

    Hiroyuki Takizawa

    Yearly report of Integrated Information Processing Center, Niigata University (13) 21-27 2002/03

  293. PC-UNIX導入時の不正アクセス対策

    滝沢寛之

    新潟大学総合情報処理センター年報NIICE 12 (12) 13-19 2001/03

  294. An active learning algorithm based on existing training data Peer-reviewed

    H Takizawa, T Nakajima, H Kobayashi, T Nakamura

    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E83D (1) 90-99 2000/01

    ISSN: 0916-8532

  295. A topology preserving neural network for nonstationary distributions Peer-reviewed

    T Nakajima, H Takizawa, H Kobayashi, T Nakamura

    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E82D (7) 1131-1135 1999/07

    ISSN: 0916-8532

  296. A self-organizing network system forming memory from nonstationary probability distributions Peer-reviewed

    T. Nakajima, H. Takizawa, H. Kobayashi, T. Nakamura

    Proceedings of IJCNN99 1999/07

  297. Acceleration techniques for the network inversion algorithm Peer-reviewed

    H Takizawa, T Nakajima, M Nishi, H Kobayashi, T Nakamura

    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E82D (2) 508-511 1999/02

    ISSN: 0916-8532

  298. Application of the neural network (BPD with cross-talk links) to FSK demodulation Peer-reviewed

    M. Nishi, J. Furuya, H. Takizawa, T. Nakamura

    The trans. of the Japanese society of technical education 41 (1) 9-16 1999/01

  299. Kohonen learning with a mechanism, the law of the jungle, capable of dealing with nonstationary probability distribution functions Peer-reviewed

    T Nakajima, H Takizawa, H Kobayashi, T Nakamura

    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E81D (6) 584-591 1998/06

    ISSN: 0916-8532

  300. Facial image processing using wavelet transform

    K. Iimura, H. Takizawa, T. Nakajima, H. Kobayashi, T. Nakamura

    Tohoku-Section Joint Convention of Institutes of Electrical and Information Engineers 1998

  301. A method for improving classification capability of multilayer perceptrons Peer-reviewed

    H. Takizawa, T. Nakajima, H. Kobayashi, T. Nakamura

    The trans. of the IEICE J80-D-II (1) 390-393 1997/01

  302. Facial expression recognition using neural networks capable of recognizing at an infant level Peer-reviewed

    T. Nakajima, H. Takizawa, M. Simamura, H. Kobayashi, T. Nakamura

    Proceedings of WAIMH 6th Congress 66-0 1996/07

  303. A study of optimal learning methods in neural networks

    H. Takizawa, T. Nakajima, H. Kobayashi, T. Nakamura

    IPSJ Regional Symposium in Tohoku 1996

  304. An automatic facial expression recognition system using neural networks

    T. Nakajima, H. Takizawa, M. Shimamura, H. Kobayashi, T. Nakamura

    IEICE Society Conference 1995

  305. Facial image recognition using neural networks

    H. Takizawa, T. Nakajima, H. Kobayashi, T. Nakamura

    Tohoku-Section Joint Convention of Institutes of Electrical and Information Engineers 1995

Misc. 57

  1. ベクトルプロセッサからFPGAへのタスクオフロードに関する一考察

    土方康平, 上野知洋, 江川隆輔, 滝沢寛之, 佐野健太郎

    電子情報通信学会技術研究報告 119 (371(VLD2019 54-93)) 2020

    ISSN: 0913-5685

  2. RDMAを用いた密結合FPGAクラスタのメモリ間通信性能

    上野知洋, 佐野健太郎, 土方康平, 滝沢寛之

    電子情報通信学会技術研究報告 119 (18(RECONF2019 1-19)(Web)) 2019

    ISSN: 0913-5685

  3. HPCMG-FVを用いたSX-ACEの性能評価

    江川隆輔, 磯部洋子, 加藤季広, 小松一彦, 滝沢寛之, 小林広明, 撫佐昭裕

    東北大学情報シナジーセンター大規模科学計算機システム広報SENAC 50 (3) 15-18 2017/07

    Publisher: 東北大学サイバーサイエンスセンター

    ISSN: 0286-7419

  4. Xevolverによる大気・海洋結合マルチスケールモデルMSSGの性能最適化コード管理の評価

    板倉 憲一, 小松 一彦, 江川 隆輔, 滝沢 寛之

    ハイパフォーマンスコンピューティングと計算科学シンポジウム論文集 (2017) 12-12 2017/05/29

  5. 計算科学・計算機科学人材育成のためのスーパーコンピュータ無償提供利用報告 情報科学研究科 超高速情報処理論利用報告

    滝沢寛之, 江川隆輔, 後藤英昭

    東北大学情報シナジーセンター大規模科学計算機システム広報SENAC 50 (3) 23-27 2017

  6. SX-ACEにおけるHPCG ベンチマークの性能評価

    小松 一彦, 江川 隆輔, 磯部 洋子, 緒方 隆盛, 滝沢 寛之, 小林 広明

    SENAC : 東北大学大型計算機センター広報 48 (3) 14-19 2015/07

    Publisher: 東北大学サイバーサイエンスセンター

    ISSN: 0286-7419

  7. 東北大学サイバーサイエンスセンター高速化推進研究活動報告書(第6号)

    小林広明, 岡部公起, 滝沢寛之, 江川隆輔, 小松一彦, 大泉健治, 小野 敏, 山下毅, 佐々木大輔, 森谷友映, 齋藤敦子, 撫佐昭裕, 松岡浩司, 渡部修 他

    2015/04

  8. Auto-Tuning with Xevolver

    20 (2) 3258-3261 2015

    Publisher: 日本計算工学会

    ISSN: 1341-7622

  9. Xevolverを用いた自動チューニング

    平澤将一, 肖熊, 滝沢寛之, 小林広明

    計算工学会学会誌「計算工学」 20 (2) 14-17 2015

  10. Heuristic Data Partitioning for Social Networking Service

    2013 (34) 1-8 2013/12/09

  11. マルチプラットフォームにおける最適化手法の効果に関する一検討

    小松一彦, 佐々木俊英, 江川隆輔, 滝沢寛之, 小林広明

    研究報告ハイパフォーマンスコンピューティング(HPC) 2013 (24) 1-7 2013/07/24

    Publisher: 一般社団法人情報処理学会

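    In recent years, HPC systems have become increasingly diverse, and there is strong demand for performance-portable HPC codes that can achieve high performance on multiple kinds of HPC systems with different characteristics. This study aims to analyze in detail the effects that optimization techniques for various HPC systems have on the performance of HPC codes, and to develop highly performance-portable HPC codes based on the findings. This report analyzes the performance portability of HPC codes when different manual optimizations are combined with one another or with automatic optimizations. For each HPC system, we evaluate the synergistic effects of these combinations and discuss optimizations that may degrade performance portability.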

  12. チューニング対象の限定による効率の良い性能可搬性向上手法

    平澤将一, 秋葉諒, 滝沢寛之, 小林広明

    研究報告ハイパフォーマンスコンピューティング(HPC) 2013 (19) 1-8 2013/05/22

    Publisher: 一般社団法人情報処理学会

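    With the diversification of computing systems, existing scientific computing programs often need to be ported to new computing systems and have their performance optimized. However, porting and optimizing large-scale scientific computing programs requires enormous effort, which has become a problem. This study proposes a method that efficiently improves the performance portability of an entire application by narrowing down the parts of the source code that should be optimized first when the goal is to improve performance portability. Evaluation results with benchmark programs and a real application show that the proposed method can narrow down the parts of the source code to be optimized so as to efficiently improve the performance portability of the entire application.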

  13. Message from the chairs of iWAPT 2012

    Hiroyuki Takizawa, Richard Vuduc, Takeshi Iwashita

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 7851 2013

    DOI: 10.1007/978-3-642-38718-0  

    ISSN: 0302-9743 1611-3349

  14. 複合システムにおけるチェックポイントリスタート

    滝沢寛之, 佐藤雅之, 江川隆輔, 小林広明

    日本信頼性学誌 35 (7) 2013

    DOI: 10.11348/reajshinrai.35.8_515  

  15. 統合開発環境と連携するポータブルなビルドシステム

    平澤将一, 滝沢寛之, 小林広明

    研究報告ハイパフォーマンスコンピューティング(HPC) 2012 (28) 1-8 2012/09/26

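    This study develops a portable build system as a step toward a framework for developing applications while maintaining performance portability. The configurations of current high-performance computing (HPC) systems are becoming increasingly complex, and it is difficult to predict the sustained performance of an application without executing it. We therefore envision supporting the maintenance of performance portability by periodically executing an application under development, implicitly collecting its performance profile, identifying the parts with low performance portability, and presenting them interactively to the programmer. Realizing such an application development support tool requires the ability to implicitly build and execute the application under development on various systems. This study develops a build system with such portability and discusses the functions required of an application development support environment.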

  16. Implementation and Evaluation of the Nanopowder Growth Simulation with OpenACC

    2012 (10) 1-7 2012/09/26

  17. 大規模計算システムにおけるBCMの性能評価

    小松 一彦, 曽我 隆, 江川 隆輔, 滝沢 寛之, 小林 広明

    SENAC : 東北大学大型計算機センター広報 45 (3) 17-25 2012/07

    Publisher: 東北大学サイバーサイエンスセンター

    ISSN: 0286-7419

  18. Evaluation of GPU Computing Based on An Automatic Program Generation Technology

    2011 (18) 1-7 2011/07/20

  19. A Client-Level Deadline Scheduling Strategy for Volunteer Computing Systems

    2011 45-54 2011/05/18

  20. A Performance Tuning Strategy Based on the Roofline Model for Vector Processors

    4 (3) 77-87 2011/05/12

    ISSN: 1882-7829

  21. 東北大学サイバーサイエンスセンター高速化推進研究活動報告書(第5号)

    小林広明, 岡部公起, 滝沢寛之, 江川隆輔, 伊藤英一, 大泉健治, 小野 敏, 小久保達信, 橋本ユキ子, 磯部洋子, 撫佐昭裕, 神山 典, 金野浩伸

    2011/04

  22. チップマルチベクトルプロセッサのためのプログラム最適化技術

    佐藤義永, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

    東北大学情報シナジーセンター大規模科学計算機システム広報SENAC 44 (2) 29-36 2011/04

  23. A Self-Organized Overlay Network Management Mechanism for Heterogeneous Environments

    Tsutomu Inaba, Hiroyuki Takizawa, Hiroaki Kobayashi

    52 (2) 320-333 2011/02/15

    Publisher: 情報処理学会

    ISSN: 1882-7764

  24. Energy Consumption of a Chip Multi-Vector Processor Using Real Applications

    2010 (3) 1-8 2010/12/09

    Publisher: 情報処理学会

    ISSN: 1884-0930

  25. An Out-of-order Vector Processing Mechanism for Multimedia Applications

    GAO YE, EGAWA RYUSUKE, TAKIZAWA HIROYUKI, KOBAYASHI HIROAKI

    2010 (24) 1-10 2010/07/27

    Publisher: 情報処理学会

    ISSN: 0919-6072

  26. Performance Evaluation of GPU Computing with OpenCL

    ARAI YUSUKE, SATO KATSUTO, TAKIZAWA HIROYUKI, KOBAYASHI HIROAKI

    2010 (11) 1-7 2010/02/15

    Publisher: 情報処理学会

    ISSN: 0919-6072

  27. Implementation and Evaluation of a Checkpoint/Restart Tool for CUDA Applications

    TAKIZAWA HIROYUKI, SATO KATSUTO, KOMATSU KAZUHIKO, KOBAYASHI HIROAKI

    122 (7) G1-G7 2009/10/09

    Publisher: 情報処理学会

    ISSN: 0919-6072

  28. RC-008 Client-Level Task Scheduling for Effective Volunteer Computing

    Murata Yoshitomo, Endo Toshiaki, Takizawa Hiroyuki, Kobayashi Hiroaki

    8 (1) 165-172 2009/08/20

    Publisher: Forum on Information Technology

  29. C-024 An Auction based Resource Allocation Considering Multifaceted Utilities in a Peer to Peer Environment

    Satayapiwat Chainan, Komatsu Kazuhiko, Egawa Ryusuke, Takizawa Hiroyuki, Kobayashi Hiroaki

    8 (1) 491-494 2009/08/20

    Publisher: Forum on Information Technology

    Recently, many market-based approaches have been studied as promising alternatives for the resource allocation problem. In particular, auction-based approaches are widely chosen due to their distributed nature and relatively low complexity. However, employing an auction to allocate jobs is only suitable for homogeneous resource environments. This paper proposes an auction-based resource allocation mechanism that enables resource allocation in a heterogeneous environment while minimizing user input. Our preliminary results show that our resource allocation mechanism improves the performance of important jobs under high load.

  30. C-023 Performance Evaluation towards BLAS with Automatic Processor Selection

    Komatsu Kazuhiko, Koyama Kentaro, Sato Katsuto, Takizawa Hiroyuki, Kobayashi Hiroaki

    8 (1) 485-490 2009/08/20

    Publisher: Forum on Information Technology

  31. Performance Optimization Techniques for Vector Processors with Cache Memory

    SATO YOSHIEI, NAGAOKA RYUICHI, MUSA AKIHIRO, EGAWA RYUSUKE, TAKIZAWA HIROYUKI, OKABE KOKI, KOBAYASHI HIROAKI

    2009 (6) 1-10 2009/07/28

    Publisher: 情報処理学会

    ISSN: 0919-6072

  32. SX-9による大規模並列シミュレーション(3.2 第7回情報シナジー研究会, 3. 研究活動報告)

    曽我 隆, 下村 陽一, 撫佐 昭裕, 江川 隆輔, 滝沢 寛之, 岡部 公起, 小林 広明, 高橋 俊, 中橋 和博

    年報 8 88-93 2009/07

    Publisher: 東北大学サイバーサイエンスセンター

  33. Software Automatic Tuning Technologies for Scientific and Technical Computing : Software Automatic Tuning in GPU Computing

    TAKIZAWA Hiroyuki

    IPSJ Magazine 50 (6) 527-531 2009/06/15

    Publisher: Information Processing Society of Japan (IPSJ)

    ISSN: 0447-8053

  34. Software Automatic Performance Tuning in GPU Computing

    Hiroyuki TAKIZAWA

    Journal of Information Processing Society of Japan 50 (6) 527-531 2009/06/15

  35. 創造工学研修の実施報告 ― スパコンを使って計算科学・計算機科学のおもしろさを体験 ―

    滝沢 寛之, 江川 隆輔, 笹尾 泰洋, 佐野健太郎, 山本 悟, 小林 広明

    東北大学サイバーサイエンスセンター 大規模科学計算システム広報SENAC 42 (2) 87-90 2009/02

  36. 624 A study of energy-aware GPU computing

    Takizawa Hiroyuki, Sato Katuto, Kobayashi Hiroaki

    The Computational Mechanics Conference 2008 (21) 558-559 2008/11/01

    Publisher: The Japan Society of Mechanical Engineers

    ISSN: 1348-026X

  37. RC-006 Hardware Design of A Way-Allocatable Shared Cache Mechanism

    Abe Kenta, Kotera Isao, Egawa Ryusuke, Takizawa Hiroyuki, Kobayashi Hiroaki

    7 (1) 35-38 2008/08/20

    Publisher: Forum on Information Technology

  38. A programming language extension and its automatic optimization techniques for exploiting the potential of GPUs

    SATO KATUTO, TAKIZAWA HIROYUKI, KOBAYASHI HIROAKI

    IPSJ SIG Notes 2008 (74) 199-204 2008/07/29

    Publisher: Information Processing Society of Japan (IPSJ)

    ISSN: 0919-6072

    GPUs have great potential for high-performance computing and have been used in various applications in addition to graphics processing. In order to achieve high performance with GPUs, we have to carry out architecture-aware optimizations because of their unique architecture. We have proposed SPRAT, a programming language for hybrid systems of CPUs and GPUs, to realize both the portability of programs and high computation efficiency. This paper proposes some automatic optimization techniques based on memory access adjustments. The results show significant performance improvements in the executions of edge detection and LU decomposition.

  39. On-Chip Cache Memory Systems for Next Vector Architectures

    7 89-93 2008/07

    Publisher: 東北大学サイバーサイエンスセンター

  40. A Stream Programming Language for GPU Computing

    TAKIZAWA Hiroyuki, SATO Katuto, KOBAYASHI Hiroaki

    Journal of the Visualization Society of Japan 28 (1) 271-274 2008/07/01

    Publisher: 可視化情報学会

    ISSN: 0916-4731

  41. ベクトルプロセッサ用キャッシュメモリの性能評価

    佐藤義永, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

    情報処理学会シンポジウム論文集 2008 (2) 55 2008/01/17

    ISSN: 1344-0640

  42. SPRAT : 実行時自動チューニング機能を備えるストリーム処理記述用言語

    滝沢寛之

    先進的計算基盤システムシンポジウム(SACSIS2008) 139-148 2008

  43. I-004 A Parallel Image Generation Algorithm based on Partitioning of Photon Maps

    Tamura Masahide, Takizawa Hiroyuki, Kobayashi Hiroaki

    6 (3) 203-206 2007/08/22

    Publisher: Forum on Information Technology

  44. A Study on Dynamic Task Assignment to CPU and GPU Based on Runtime Performance Prediction

    SHIRATORI Hiroki, TAKIZAWA Hiroyuki, KOBAYASHI Hiroaki

    IEICE technical report 107 (175) 37-42 2007/08/02

    Publisher: The Institute of Electronics, Information and Communication Engineers

    ISSN: 0913-5685

    Recent studies of general-purpose computation on graphics processing units (GPUs) have shown that a PC equipped with a high-performance CPU and GPU can be regarded as a heterogeneous parallel processing system. On the other hand, programming for such a system has become complicated. In order to exploit the potential of the system, unified programming models for the CPU and GPU have been studied. However, the selection of the CPU or GPU that executes a program must be made manually and statically in most of the existing development tools for GPGPU applications. Because appropriate selection depends on some information determined at runtime, the processing efficiency improves if the appropriate processor can be dynamically selected based on the performance prediction at runtime. This paper examines the effectiveness of dynamically selecting the appropriate processor based on the execution time estimation and the processor switching cost. The experimental results show that the cost of the processor switching except the data transfer is negligible and hence the processor switching can improve the performance if the execution time is long compared to the prediction error.

  45. The Evaluation of A Way-Allocatable Shared Cache Mechanism

    KOTERA ISAO, EGAWA RYUSUKE, TAKIZAWA HIROYUKI, KOBAYASHI HIROAKI

    IPSJ SIG Notes 2007 (79) 31-36 2007/08/01

    Publisher: Information Processing Society of Japan (IPSJ)

    ISSN: 0919-6072

    We have proposed a way-allocatable shared cache mechanism for chip multiprocessors, which can reduce power consumption while maintaining performance by employing cache partitioning and power gating. In the proposed mechanism, a metric of cache access locality is defined and used for the cache partitioning and the power gating. Based on the metric, the proposed mechanism can flexibly change the configuration to be either performance-oriented or power-oriented. This paper evaluates the validity of the proposed mechanism, using some benchmarks with different cache access behaviors. The evaluation results show that the proposed mechanism can appropriately partition the shared cache for applications with high localities. In addition, our proposal in the performance-oriented mode can reduce energy consumption by 28% while improving the performance by 0.3%.

  46. SC|06調査報告(3.2 第5回情報シナジー研究会, 3. 研究活動報告)

    小野 敏, 滝沢 寛之, 小林 広明

    年報 6 83-87 2007/07

    Publisher: 東北大学情報シナジーセンター

  47. SC|05調査報告(3.2 第4回情報シナジー研究会, 3. 研究活動)

    大泉 健治, 伊藤 英一, 滝沢 寛之, 小林 広明

    年報 5 71-74 2006/06

    Publisher: 東北大学情報シナジーセンター

  48. A Runtime Optimization Method for Redundant Task Dispatch on P2P Computing Platforms.(3.2 第4回情報シナジー研究会, 3. 研究活動)

    Wang Hong, Takizawa Hiroyuki, Kobayashi Hiroaki

    年報 5 100-105 2006/06

    Publisher: 東北大学情報シナジーセンター

  49. 実シミュレーションコードによる大規模科学計算システムの性能評価(3.2 第4回情報シナジー研究会, 3. 研究活動)

    滝沢 寛之, 岡部 公起, 伊藤 英一, 撫佐 昭裕, 曽我 隆, 伊藤 学, 小林 広明

    年報 5 78-83 2006/06

    Publisher: 東北大学情報シナジーセンター

  50. HPCチャレンジでのSXシステムの性能評価(3.2 第3回情報シナジー研究会, 3. 研究活動)

    小林 広明, 滝沢 寛之, 小久保 達信, 岡部 公起, 伊藤 英一, 小林 義昭, 浅見 暁, 小林 一夫, 後藤 記一, 片海 健亮, 深田 大輔

    年報 4 98-116 2005/05

    Publisher: 東北大学情報シナジーセンター

  51. HPC チャレンジでのSX システムの性能評価

    小林広明, 滝沢寛之, 小久保達信, 岡部公起, 伊藤英一, 小林義昭, 浅見暁, 小林一夫, 後藤記一, 片海健亮, 深田大輔

    東北大学情報シナジーセンター大規模科学計算機システム広報SENAC 38 (1) 5-28 2005/01

  52. スーパーSINET を利用した大規模遠隔可視化処理の評価

    滝沢寛之, 小林広明

    東北大学情報シナジーセンター大規模科学計算機システム広報SENAC 37 (2) 5-10 2004/04

  53. Performance Analysis of a Parallel Law-of-the-Jungle Algorithm for Generating Codebooks of Vector Quantization

    MOMOSE Shintaro, SANO Kentaro, TAKIZAWA Hiroyuki, NAKAJIMA Taira, KOBAYASHI Hiroaki, NAKAMURA Tadao

    IEICE technical report. Neurocomputing 103 (92) 25-30 2003/05/22

    Publisher: The Institute of Electronics, Information and Communication Engineers

    ISSN: 0913-5685

    Vector quantization is an attractive technique for lossy data compression, which has been a key technology for efficient data storage and/or transfer. So far, various algorithms have been proposed to design optimal codebooks that minimize quantization error. In particular, the Law-of-the-Jungle (LOJ) learning algorithm has been proposed to achieve rapid codebook design through algorithmic improvements. However, further acceleration is still required when large data sets are processed on a single computer. To achieve faster codebook design, we have proposed a scalable parallel codebook design algorithm for parallel computers. This paper analyzes and evaluates the performance of the parallel LOJ learning algorithm on three types of parallel computers: an IBM SP2, an NEC AzusA, and a PC cluster.

  54. Parallel Codebook Generation for Optimal Vector Quantizer

    MOMOSE Shintaro, SANO Kentaro, TAKIZAWA Hiroyuki, NAKAJIMA Taira, LIMA Clecio Donizete, KOBAYASHI Hiroaki, NAKAMURA Tadao

    IPSJ SIG Notes 2002 (80) 67-72 2002/08/21

    Publisher: Information Processing Society of Japan (IPSJ)

    ISSN: 0919-6072

    Vector quantization is an attractive technique for lossy data compression, which has been a key technology for data storage and/or transfer. So far, various algorithms have been proposed to design optimal codebooks that minimize quantization error. In particular, the Law-of-the-Jungle (LOJ) learning algorithm has been proposed to achieve rapid codebook design through algorithmic improvements. However, further acceleration is still required when large data sets are processed on a single computer. Therefore, a scalable parallel codebook design algorithm for parallel computers is required. This paper presents a parallel algorithm for LOJ learning that is suitable for distributed-memory parallel computers with a message-passing mechanism. Experimental results indicate high scalability of the proposed parallel algorithm on the IBM SP2 parallel computer with 32 processing elements.

  55. ベクトル量子化のための並列コードブック生成アルゴリズムの性能評価(2.<特集>第1回情報シナジー研究会)

    百瀬 真太郎, 佐野 健太郎, 滝沢 寛之, 中島 平, 小林 広明, 中村 維男, Clecio Donizete Lima, 東北大学大学院情報科学研究科, 東北大学大学院情報科学研究科, 東北大学情報シナジーセンター, 東北大学大学院工学研究科, 東北大学大学院情報科学研究科, 東北大学情報シナジーセンター, 東北大学大学院情報科学研究科

    年報 2 33-42 2002/07/01

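    Vector quantization is a highly efficient data compression technique and a core technology for data storage and transfer. Various methods have been proposed for generating optimal codebooks that quantize with small error; among them, the Law-of-the-Jungle (LOJ) algorithm, which shortens codebook generation time through algorithmic improvements, has attracted attention. However, when a large data set is processed on a single CPU, the reduction in processing time achievable through algorithmic improvements is limited, and further speedup through parallel processing is required. This paper proposes a parallel LOJ algorithm suited to distributed-memory parallel computers. A performance evaluation of the parallel LOJ algorithm on an IBM SP2, an NEC AzusA, and a PC cluster shows that a high speedup with respect to the number of processors is obtained on all three.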

  56. 新潟大学総合情報処理センターコンピュータシステムの更新

    滝沢寛之

    新潟大学総合情報処理センター年報NIICE (13) 21-27 2002/03

  57. PC-UNIX 導入時の不正アクセス対策

    滝沢寛之

    新潟大学総合情報処理センター年報NIICE (12) 13-19 2001/03

Books and Other Publications 15

  1. Sustained Simulation Performance 2022

    Michael M. Resch, Johannes Gebert, Hiroaki Kobayashi, Hiroyuki Takizawa, Wolfgang Bez

    Springer Cham 2024/03

    ISBN: 9783031410727

  2. VLSI Design and Test for Systems Dependability

    Hiroyuki Takizawa, Ye Gao, Masayuki Sato, Ryusuke Egawa, Hiroaki Kobayashi

    Springer Japan 2019/01

  3. Advanced Software Technologies for Post-Peta Scale Computing

    Hiroyuki Takizawa, Reiji Suda, Daisuke Takahashi, Ryusuke Egawa

    Springer 2018/12

  4. Sustained Simulation Performance 2016

    Hiroyuki Takizawa, Takeshi Yamada, Shoichi Hirasawa, Reiji Suda

    Springer-Verlag 2016

  5. コンピュータ工学入門

    鏡慎吾, 佐野健太郎, 滝沢寛之, 岡谷貴之, 小林広明

    コロナ社 2015/04

  6. Sustained Simulation Performance 2015

    Hiroyuki Takizawa, Daichi Sato, Shoichi Hirasawa, Hiroaki Kobayashi

    Springer-Verlag 2015

  7. Sustained Simulation Performance 2014

    Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Springer-Verlag 2014

  8. High Performance Computing on Vector Systems 2012

    Hiroyuki Takizawa, Ryusuke Egawa, Daisuke Takahashi, Reiji Suda

    Springer-Verlag 2012

  9. High Performance Computing on Vector Systems, 2012

    Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Springer-Verlag 2012

  10. Software Automatic Tuning: From Concepts to State-of-the-Art Results

    Katsuto Sato, Hiroyuki Takizawa, Kazuhiko Komatsu, Hiroaki Kobayashi

    Springer-Verlag 2010

  11. High Performance Computing on Vector Systems 2009

    Hiroaki Kobayashi, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Akihiro Musa, Takashi Soga, Yoko Isobe

    Springer-Verlag 2009

  12. High Performance Computing on Vector Systems 2007

    Hiroaki Kobayashi, Akihiro Musa, Yoshiei Sato, Hiroyuki Takizawa, Koki Okabe

    Springer-Verlag 2008

  13. High Performance Computing on Vector Systems 2008

    Hiroaki Kobayashi, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Akihiro Musa, Takashi Soga, Yoichi Shimomura

    Springer-Verlag 2008

  14. New text for operating information processing devices

    M. Yamamoto, H. Takizawa

    2003/04

  15. Text for operating information processing devices

    I. Yamasaki, M. Hasegawa, H. Takizawa

    2000/04

Presentations 129

  1. ワークフローエンジンとの連携に基づく臨機応変なジョブスケジューリングの実現

    滝沢寛之

    第16回 自動チューニング技術の現状と応用に関するシンポジウム(ATTA2024) 2024/12/26

  2. スパコンAOBA-Sの性能評価と将来計画 Invited

    滝沢寛之

    太陽地球環境シミュレーション研究会 2024/12/24

  3. New Strategies at Tohoku University Cyberscience Center

    Hiroyuki Takizawa

    38th Workshop on Sustained Simulation Performance 2024/12/12

  4. ExpressHPC: towards "connected supercomputing" enabling on-demand job execution for disaster resilience.

    Hiroyuki Takizawa, Tatsuyoshi Ohmura, Keichi Takahashi, Yoichi Shimomura, Ryusuke Egawa, Yoshihiko Sato, Junko Yoshino, Akihiro Musa, Shunichi Koshimura

    4th Combined Workshop on Interactive and Urgent High-Performance Computing (WIUHPC) 2024/11/18

  5. Realizing Connected Supercomputing with dynamic and adaptive resource management Invited

    Hiroyuki Takizawa

    SC24 Nagoya University Booth Presentation 2024/11/18

  6. 10年後の情報基盤センターは地球と人類にいかに貢献するか? Invited

    滝沢寛之

    第50回ASE研究会 2024/11/08

  7. Connected Supercomputing with on-demand job execution for disaster mitigation and more… Invited

    Hiroyuki Takizawa

    Reality in Science, Art, and Humanities – paradigms of its media conditions 2024/10/21

  8. Operational experience of the latest-generation SX-Aurora TSUBASA system, AOBA-S Invited

    Hiroyuki Takizawa

    37th Workshop on Sustained Simulation Performance 2024/06/17

  9. Introduction of AOBA-S: The world’s largest SX-Aurora TSUBASA system operating at Tohoku University Invited

    Hiroyuki Takizawa

    NUG Society Meeting 35 2024/06/14

  10. ML-based Autotuning of Quantum Annealing Schedule Invited

    Hiroyuki Takizawa, Michael Zielewski, Keichi Takahashi, Yoichi Shimomura

    Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT2024) 2024/03/22

  11. スパコンAOBAの運用開始と将来展望 Invited

    滝沢寛之

    Supercomputing JAPAN! 2024 2024/03/12

  12. Automatic Parameter Tuning for Efficient Checkpointing International-presentation Invited

    Hiroyuki Takizawa, Muhammad Alfian Amrizal, Kazuhiko Komatsu, Ryusuke Egawa

    28th Workshop on Sustained Simulation Performance 2018/10/10

  13. Initial Evaluation of Basic Performance and Functionality of Aurora Invited

    TAKIZAWA Hiroyuki

    SX-Aurora TSUBASA Forum 2018/07/27

  14. Automatic Parameter Tuning of Application-Level Incremental Checkpointing International-presentation Invited

    Hiroyuki Takizawa, Muhammad Alfian Amrizal, Kazuhiko Komatsu, Ryusuke Egawa

    2018 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing 2018/03/27

  15. Towards prediction of effective optimizations in performance engineering International-presentation

    Hiroyuki Takizawa, Yuki Kawarabatake, Mulya Agung, Kazuhiko Komatsu, Ryusuke Egawa

    27th Workshop on Sustained Simulation Performance 2018/03/22

  16. Make full use of supercomputers! -- Importance and challenges for efficient use of supercomputers -- Invited

    TAKIZAWA Hiroyuki

    2018/03/16

  17. User-Defined Code Transformation for Separation of Performance-Awareness from Application Codes International-presentation

    Hiroyuki Takizawa

    SIAM Conference on Parallel Processing for Scientific Computing (mini-symposium) 2018/03/09

  18. Auto-tuning of Hyperparameters of Machine Learning Models International-presentation

    Zhen Wang, Ryusuke Egawa, Reiji Suda, Hiroyuki Takizawa

    HPC Asia 2018 2018/01/29

  19. Thermal-aware Dynamic Checkpoint Interval Tuning for High Performance Computing International-presentation

    Pei Li, Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

    HPC Asia 2018 2018/01/29

  20. A User-defined Code Transformation Approach to Separation of Performance Concerns International-presentation

    Hiroyuki Takizawa

    First Workshop on Software Challenges to Exascale Computing 2017/12/17

  21. 大規模科学計算システムにおける利用者プログラムの特性分析

    大泉健治, 山下毅, 穂苅寛光, 江川隆輔, 滝沢寛之, 小林広明

    大学ICT推進協議会 2017年度 年次大会 (AXIES2017) 2017/12/13

  22. 反応・相変化を伴う多分散系混相流シミュレーションコードの最適化

    佐々木大輔, 加藤季広, 磯部洋子, 笠原弘貴, 渡部広吾輝, 志村啓, 奥野航平, 松尾亜紀子, 江川隆輔, 滝沢寛之, 小林広明

    大学ICT推進協議会 2017年度 年次大会 (AXIES2017) 2017/12/13

  23. Expressing performance-awareness as user-defined code transformations International-presentation

    Hiroyuki Takizawa, Reiji Suda, Daisuke Takahashi, Ryusuke Egawa, Fumihiko Ino

    International Symposium on Post Petascale System Software 2017/12/11

  24. An Evolutionary Approach to Construction of a Software Development Environment for Massively-Parallel Heterogeneous Systems International-presentation

    Hiroyuki Takizawa, Reiji Suda, Daisuke Takahashi, Ryusuke Egawa

    International Symposium on Post Petascale System Software 2017/12/11

  25. Performance Engineering with User-defined Code Transformations International-presentation

    Hiroyuki Takizawa

    Joint Workshop on High-Performance Computing with NSCC-Wuxi and Tohoku University 2017/09/21

  26. ExaFSA - Exascale Simulation of Fluid-Structure-Acoustics Interactions International-presentation

    Florian Lindner, Miriam Mehl, Thorsten Reimann, Sabine Roller, Dörte C. Sternel, Hiroyuki Takizawa, Sander van Zujilen

    ISC High Performance 2017 2017/07/18

  27. Xevolverプロジェクト -- 計算科学と計算機科学をつなぐ架け橋を目指して --

    滝沢寛之

    高度情報科学技術研究機構 平成28年度 高速化ワークショップ 2017/03/24

  28. Performance Tuning with Machine Learning International-presentation

    Hiroyuki Takizawa, Cui Hang, Shoichi Hirasawa

    The 25th Workshop on Sustained Simulation Performance 2017/03/13

  29. Combining Code Transformations and Autotuning International-presentation

    Hiroyuki Takizawa

    2017 Advanced Topics and Auto-Tuning in High-Performance Scientific Computing 2017 2017/03/11

  30. User-Defined Directive Translation for Automatic Tuning International-presentation

    Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    2017 Advanced Topics and Auto-Tuning in High-Performance Scientific Computing 2017 2017/03/11

  31. User-Defined Directive Translation Using the Xevolver Framework International-presentation

    Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    SIAM Computational Science and Engineering 2017/03/02

  32. 進化的アプローチによる超並列複合システム向け開発環境の創出

    滝沢寛之

    第8回 自動チューニング技術の現状と応用に関するシンポジウム(ATTA2016) 2016/12/26

  33. Xevolverプロジェクトの概要

    滝沢寛之

    ポストペタワークショップ 2016/12/14

  34. Autotuning meets Code Transformations International-presentation

    Hiroyuki Takizawa

    24th Workshop on Sustained Simulation Performance 2016/12/05

  35. Making a Legacy Code Auto-Tunable without Messing It Up International-presentation

    Hiroyuki Takizawa, Daichi Sato, Shoichi Hirasawa, Hiroaki Kobayashi

    ACM/IEEE Supercomputing Conference (SC16) 2016/11/13

  36. User-Defined Code Transformation for High Performance Portability International-presentation

    Hiroyuki Takizawa, Shoichi Hirasawa, Hiroaki Kobayashi

    SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP16) 2016/04/12

  37. Performance Engineering of HPC Applications Based on Pattern Matching International-presentation

    Hiroyuki TAKIZAWA, Takeshi YAMADA, Takuya TSUNOGAWA, Shoichi HIRASAWA, Hiroaki KOBAYASHI

    23rd Workshop on Sustained Simulation Performance 2016/03/16

  38. Data layout optimization using user-defined code transformations International-presentation

    Hiroyuki Takizawa, Takeshi Yamada, Shoichi Hirasawa, Hiroaki Kobayashi

    2016 Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing 2016/02/19

  39. A Code Transformation Approach to Achieving High Performance Portability International-presentation

    Hiroyuki TAKIZAWA, Daisuke TAKAHASHI, Reiji SUDA, Ryusuke EGAWA

    SPPEXA Annual Plenary Meeting 2016 2016/01/25

  40. 進化的アプローチによる 超並列複合システム向け開発環境の創出

    滝沢 寛之, 高橋大介, 須田礼仁, 江川隆輔

    第7回 自動チューニング技術の現状と応用に関するシンポジウム(ATTA2015) 2015/12/25

  41. Xevtgen: automatic generation of code transformation rules based on before-and-after codes International-presentation

    Hiroyuki Takizawa, Shoichi Hirasawa, Reiji Suda

    22nd Workshop on Sustained Simulation Performance 2015/12/17

  42. The Xevolver Project: Separation of Concerns for Supporting Legacy Application Migration

    Hiroyuki Takizawa

    ATRG Open Academic Session 2015/12/11

  43. 機械工学分野における シミュレーション科学の新展開

    滝沢寛之

    学際大規模情報基盤共同利用・共同研究拠点 第7回シンポジウム 2015/07/09

  44. Framework for Separation of Concerns Between Application Requirements and System Requirements International-presentation

    Hiroyuki Takizawa, Shoichi Hirasawa, Hiroaki Kobayashi

    SIAM Conference on Computational Science & Engineering 2015 2015/03/16

  45. Auto-Tuning with User-Defined Code Transformations International-presentation

    Hiroyuki Takizawa

    2015 Conference on Advanced Topics and Auto-Tuning in High-Performance Scientific Computing 2015/02/26

  46. What can we do to fight with system diversity? International-presentation

    Hiroyuki Takizawa

    21st Workshop on Sustained Simulation Performance 2015/02/18

  47. 進化的アプローチによる 超並列複合システム向け開発環境の創出

    滝沢寛之, 須田礼仁, 高橋大介, 江川隆輔

    第6回 自動チューニング技術の現状と応用に関するシンポジウム(ATTA2014) 2014/12/25

  48. Xevolver: an extensible framework for user-defined code transformation International-presentation

    Hiroyuki Takizawa

    20th Workshop on Sustained Simulation Performance 2014/12/15

  49. Xevolver Project International-presentation

    Hiroyuki Takizawa, Daisuke Takahashi, Reiji Suda, Ryusuke Egawa

    International Symposium on Post Petascale System Software (ISP2S2) 2014 2014/12/02

  50. Xevolver Project International-presentation

    Hiroyuki Takizawa, Daisuke Takahashi, Reiji Suda, Ryusuke Egawa

    Asian Technology Information Program (ATIP) Workshop at SC14 2014/11/17

  51. 機械工学分野における シミュレーション科学の新展開

    滝沢寛之

    学際大規模情報基盤共同利用・共同研究拠点 第6回シンポジウム 2014/07/11

  52. Evolutionary Adaptation of HPC Applications to Revolutionary System Changes International-presentation

    Hiroyuki Takizawa

    International Supercomputing Conference (ISC) 2014 2014/06/22

  53. Xevolver: an extensible programming framework for custom code transformation International-presentation

    Hiroyuki Takizawa, Shoichi Hirasawa, Hiroaki Kobayashi

    2014 Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing 2014/03/15

  54. はやいぃスパコンは作れる!?

    滝沢寛之

    JACORN2013 Winter - 次世代 RHW 創造研究会 2013/12/26

  55. 進化的アプローチによる 超並列複合システム向け開発環境の創出

    滝沢寛之, 須田礼仁, 高橋大介, 江川隆輔

    第5回 自動チューニング技術の現状と応用に関するシンポジウム(ATTA2013) 2013/12/25

  56. An XML-based Programming Framework for User-defined Code Transformations International-presentation

    Hiroyuki Takizawa, Xiong Xiao, Shoichi Hirasawa, Hiroaki Kobayashi

    4th AICS International Symposium 2013/12/02

  57. Xevolver : an XML-based Programming Framework for Software Evolution International-presentation

    Hiroyuki Takizawa, Shoichi Hirasawa, Hiroaki Kobayashi

    ACM/IEEE Supercomputing Conference (SC13) 2013/11/17

  58. XMLを用いたツール間連携に向けて

    滝沢寛之

    1st XcalableMP Workshop 2013/11/01

  59. Xevolver: towards an extensible programming environment for software evolution International-presentation

    Hiroyuki Takizawa

    International Symposium on Embedded Multicore/Many-core Systems-on-Chip 2013/09/26

  60. OpenACCにおける性能チューニングとその効果

    滝沢寛之, 平澤将一, 小松一彦, 小林広明

    日本応用数理学会年会 2013/09/09

  61. A Case Study of Performance Tuning with the POET Framework

    肖 熊, 平澤将一, 滝沢寛之, 小林広明

    電気関係学会東北支部連合大会 2013/08/23

  62. Code Refactoring for High Performance Computing Applications

    Chunyan Wang, 平澤将一, 滝沢寛之, 小林広明

    電気関係学会東北支部連合大会 2013/08/23

  63. ブロックバイパス機構によるキャッシュのエネルギ効率化に関する研究

    高井拓実, 佐藤雅之, 江川隆輔, 滝沢寛之, 小林広明

    並列/協調/分散処理に関するサマーワークショップ(SWoPP) 2013/07/31

  64. マルチプラットフォームにおける最適化手法の効果に関する一検討

    小松 一彦, 佐々木 俊英, 江川 隆輔, 滝沢 寛之, 小林 広明

    並列/協調/分散処理に関するサマーワークショップ(SWoPP) 2013/07/31

  65. Autotuning for Improving the Fault Tolerance of Large-scale Simulations International-presentation

    Hiroyuki Takizawa, Alfian Amrizal, Shoichi Hirasawa, Hiroaki Kobayashi

    Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing 2013/03/27

  66. ソフトウェア進化のための自動性能追跡システム

    平澤将一, 滝沢寛之, 小林広明

    ハイパフォーマンスコンピューティングと計算科学シンポジウム(HPCS2013) 2013/01/15

  67. プログラム自動生成技術に基づくGPUコンピューティングの性能評価

    菅原 誠, 佐藤 功人, 小松 一彦, 滝沢 寛之, 小林 広明

    並列/協調/分散処理に関するサマーワークショップ(SWoPP) 2011/07/27

  68. マイグレーションによる複合型計算システム向けジョブスケジューリング

    小山賢太郎, 佐藤功人, 小松一彦, 村田善智, 滝沢寛之, 小林広明

    先進的計算基盤システムシンポジウム(SACSIS2011) 2011/05/25

  69. ルーフラインモデルに基づくベクトルプロセッサ向けプログラム最適化戦略

    佐藤義永, 永岡龍一, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

    ハイパフォーマンスコンピューティングと計算科学シンポジウム(HPCS2011) 2011/01/18

  70. 実アプリケーションを用いたチップマルチベクトルプロセッサの消費エネルギ評価

    永岡龍一, 佐藤義永, 撫佐昭裕, 江川隆輔, 滝沢寛之, 小林広明

    ハイパフォーマンスコンピューティングとアーキテクチャの評価に関する北海道ワークショップ(HOKKE-18) 2010/12/16

  71. Cache Partitioning Strategies for 3-D Stacked Vector Processors International-presentation

    Yusuke Funaya, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEEE 3D System Integration Conference 2010 2010/11/16

  72. A Performance Tuning Strategy under Combining Loop Transforms for a Vector Processor with an On-Chip Cache International-presentation

    Yoshiei Sato, Ryuichi Nagaoka, Akihiro Musa, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

    ACM/IEEE Supercomputing Conference (SC10) 2010/11/13

  73. 複合型計算システムにおける実行時自動チューニング

    滝沢寛之

    自動チューニング技術の現状と応用に関するシンポジウム 2010/11

  74. A Runtime Task Reallocation Library for Heterogeneous Computational Environments International-presentation

    Katsuto Sato, Kazuhiko Komatsu, Hiroyuki Takizawa, Hiroaki Kobayashi

    7th International Conference on Fluid Dynamics 2010/11/01

  75. A Load-Forwarding Mechanism for the Vector Architecture in Multimedia Applications International-presentation

    Ye Gao, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Euromicro Conference on Digital System Design 2010/09/01

  76. An Out-of-order Vector Processing Mechanism for Multimedia Applications

    Ye Gao, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    並列/協調/分散処理に関するサマーワークショップ(SWoPP) 2010/08/03

  77. Efficient Data Management for the Building Cube Method using Cartesian Meshes on the GPU Platform International-presentation

    Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

    International Supercomputing Conference (ISC10) 2010/05/30

  78. Parallel Processing of the Building-Cube Method on the GPU Platform International-presentation

    Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

    22nd International Conference on Parallel Computational Fluid Dynamics 2010/05/17

  79. Performance of SOR Methods on Vector Processor SX-9 International-presentation

    Takashi Soga, Akihiro Musa, Koki Okabe, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi

    22nd International Conference on Parallel Computational Fluid Dynamics 2010/05/17

  80. ハイブリッド型計算環境のためのプログラミングフレームワークSPRAT

    小松 一彦, 小山 賢太郎, 佐藤 功人, 滝沢 寛之, 小林 広明

    先端的ネットワーク&コンピューティングテクノロジワークショップ 2010/03

  81. A High-level Programming Framework for Efficient Hybrid-architecture Computing International-presentation

    Kazuhiko Komatsu, Kentaro Koyama, Katsuto Sato, Hiroyuki Takizawa, Hiroaki Kobayashi

    14th SIAM Conference on Parallel Processing for Scientific Computing Minisymposium 2010/02/24

  82. OpenCL によるGPUコンピューティングの性能評価

    荒井勇亮, 佐藤功人, 滝沢寛之, 小林広明

    情報処理学会HPC研究会 2010/02/22

  83. GPUを手軽にちゃんと使える環境の実現に向けて

    東京工業大学計算世界観GCOEセミナー 2009/12/09

  84. A High-level GPU Programming Framework for Fluid Dynamics Simulation International-presentation

    Katsuto Sato, Hiroyuki Takizawa, Hiroaki Kobayashi

    6th International Conference on Fluid Dynamics 2009/11/04

  85. 新アーキテクチャへのアプローチ

    自動チューニング技術の現状と応用に関するシンポジウム 2009/10/22

  86. CUDAアプリケーション向けチェックポイント・リスタート機能の実装と評価

    滝沢寛之, 佐藤功人, 小松一彦, 小林広明

    情報処理学会HPC研究会 2009/10/09

  87. 実アプリケーションによるチップマルチベクトルプロセッサの性能評価

    佐藤義永, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

    次世代スーパコンピューティングシンポジウム 2009/10/07

  88. 三次元積層技術による次世代ベクトルキャッシュの設計と評価

    船矢祐介, 永岡龍一, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

    次世代スーパコンピューティングシンポジウム 2009/10/07

  89. 3D On-Chip Memory for the Vector Architecture International-presentation

    Yusuke Funaya, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    IEEE 3D System Integration Conference 2009 2009/09/28

  90. Cellによる高性能計算の可能性を探る

    日本機械学会2009年度年次大会 2009/09/15

  91. Working Sets based Thread Scheduling with Cache Partitioning International-presentation

    Masayuki Sato, Isao Kotera, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    Parallel Architecture and Compilation Techniques (PACT) 2009/09/12

  92. 次世代プログラミング環境 ~多様なプロセッサを使いこなす~

    FIT2009 2009/09/03

  93. An Auction based Resource Allocation Considering Multifaceted Utilities in a Peer-to-Peer Environment

    Chaianan Satayapiwat, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    FIT2009 2009/09/02

  94. ボランティアコンピューティングの高効率化のためのクライアントレベルスケジューリング

    村田善智, 遠藤聡明, 滝沢寛之, 小林広明

    FIT2009 2009/09/02

  95. プロセッサ自動選択機能を有するBLASの実現に向けた性能評価

    小松一彦, 小山賢太郎, 佐藤功人, 滝沢寛之, 小林広明

    FIT2009 2009/09/02

  96. キャッシュメモリを有するベクトルプロセッサのためのプログラム最適化手法

    佐藤義永, 永岡龍一, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

    並列/協調/分散処理に関するサマーワークショップ(SWoPP) 2009/08/04

  97. ワーキングセット評価に基づくスレッドスケジューリング

    佐藤雅之, 小寺功, 江川隆輔, 滝沢寛之, 小林広明

    並列/協調/分散処理に関するサマーワークショップ(SWoPP) 2009/08/04

  98. メモリ積層型3次元ベクトルプロセッサの評価

    船矢祐介, 江川隆輔, 滝沢寛之, 小林広明

    先端的計算基盤システムシンポジウム(SACSIS 2009) 2009/06/28

  99. CPUとGPUを協調利用するソフトウェア開発環境

    佐藤功人, 滝沢寛之, 小林広明

    筑波大学計算科学研究センターGPGPU講習会/研究会 2009/06/24

  100. Hiding Programming Complexity for GPU Computing

    Suda Laboratory, GPGPU special seminar 2009/06/11

  101. ストリーム処理記述言語のGPU向け自動最適化の検討

    佐藤功人, 滝沢寛之, 小林広明

    先端的計算基盤システムシンポジウム(SACSIS 2009) 2009/05/28

  102. Early Evaluation of a Memory-Stacked Vector Processor International-presentation

    Yusuke Funaya, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    COOL Chips XII 2009/04/15

  103. GPU向け線形代数ライブラリの性能評価

    小山賢太郎, 佐藤功人, 小松一彦, 滝沢寛之, 小林広明

    計算工学講演会 2009/04/13

    計算工学講演会論文集 Vol.14, no.1, pp.289-292, 2009

  104. SX-9による大規模並列シミュレーション

    曽我 隆, 下村 陽一, 撫佐 昭裕, 江川 隆輔, 滝沢 寛之, 岡部 公起, 小林 広明, 高橋俊, 中橋和博

    シナジー研究会 2009/02/13

  105. 実アプリケーションによるSX-9の性能評価

    曽我 隆, 下村 陽一, 撫佐 昭裕, 江川 隆輔, 滝沢 寛之, 岡部 公起, 小林 広明

    ハイパフォーマンスコンピューティングと計算科学シンポジウム(HPCS2009) 2009/01/12

  106. Caching on a Chip Multi Vector Processor International-presentation

    Akihiro Musa, Yoshiei Sato, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

    SC08 2008/11/15

  107. ベクトルプロセッサ用キャッシュメモリにおけるMSHRの性能評価

    佐藤義永, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

    次世代スーパーコンピューティング・シンポジウム2008 2008/09/16

  108. ウェイアロケーション型共有キャッシュ機構のハードウェア設計に関する研究

    第7回情報科学技術フォーラム(FIT2008) 2008/09/02

  109. GPU を効率的に利用するための言語拡張と自動最適化手法

    佐藤功人, 滝沢寛之, 小林広明

    並列/協調/分散処理に関するサマーワークショップ(SWoPP2008) 2008/08/05

  110. GPU コンピューティングのためのストリーム処理記述言語

    第36回可視化情報シンポジウム 2008/07/22

  111. SPRAT: 実行時自動チューニング機能を備えるストリーム処理記述用言語

    滝沢寛之, 白取寛貴, 佐藤功人, 小林広明

    情報処理学会先進的計算基盤システムシンポジウム(SACSIS2008) 2008/06/11

  112. 分散協調型スケジューラを用いた大規模計算環境上での負荷分散手法の紹介

    村田善智, 滝沢寛之, 小林広明

    第2回InTrigger Community Workshop 2008/06/04

  113. Auction-based Resource Allocation for activating incentives in resource trading in Grid Computing

    Chainan Satayapiwat, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    先端的ネットワーク&コンピューティングテクノロジワークショップ 2008/03/13

  114. Preliminary evaluation of a result checking mechanism for reliable volunteer computing

    Ling Xu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

    先端的ネットワーク&コンピューティングテクノロジワークショップ 2008/03/13

  115. A Fast Ray Frustum-Triangle Intersection Algorithm with Precomputation and Early Termination

    Kazuhiko Komatsu, Yoshiyuki Kaeriyama, Kenichi Suzuki, Hiroyuki Takizawa, Hiroaki Kobayashi

    2008 年ハイパフォーマンスコンピューティングと計算科学シンポジウム (HPCS2008) 2008/01/17

  116. ベクトルプロセッサ用キャッシュメモリの性能評価

    佐藤義永, 撫佐昭裕, 江川隆輔, 滝沢寛之, 岡部公起, 小林広明

    2008 年ハイパフォーマンスコンピューティングと計算科学シンポジウム (HPCS2008) 2008/01/17

  117. Early Evaluation of On-Chip Vector Caching for the NEC SX Vector Architecture International-presentation

    Akihiro Musa, Yoshiei Sato, Ryusuke Egawa, Hiroyuki Takizawa, Koki Okabe, Hiroaki Kobayashi

    SC07 2007/11/14

  118. Preliminary Evaluation for Runtime Auto-tuning of GPGPU Applications International-presentation

    Hiroyuki Takizawa, Hiroki Shiratori, Hiroaki Kobayashi

    The Second International Workshop on Automatic Performance Tuning 2007/09/20

  119. フォトンマップ分割に基づく並列画像生成アルゴリズム

    田村壮秀, 滝沢寛之, 小林広明

    第6回情報科学技術フォーラム 2007/09/05

  120. 実行時性能予測に基づくCPUとGPUへの動的タスク割当の検討

    白取寛貴, 滝沢寛之, 小林広明

    並列/分散/協調処理に関するサマー・ワークショップ 2007/08/01

  121. ウェイアロケーション型共有キャッシュ機構の性能評価

    小寺功, 滝沢寛之, 小林広明

    並列/分散/協調処理に関するサマー・ワークショップ 2007/08/01

  122. 遊休計算資源を用いたパラメータスイープ型並列計算におけるタスクスケジューラの性能評価

    村田善智, 小田川雅人, 滝沢寛之, 小林広明

    先端的ネットワーク&コンピューティングテクノロジワークショップ 2007/03

  123. PS3を用いた分散コンピューティング環境の開発と評価

    小田川雅人, 吉田向志, 村田善智, 滝沢寛之, 小林広明

    先端的ネットワーク&コンピューティングテクノロジワークショップ 2007/03

  124. ゲームユーザーのユビキタスコンピューティングプラットフォームへの参加を促すインセンティブモデルの検討

    中田武男, 大庭信之, 滝沢寛之, 小林広明

    先端的ネットワーク&コンピューティングテクノロジワークショップ 2007/03

  125. 描画用ハードウェアの活用によるふく射伝熱の対話的シミュレーションと可視化

    滝沢寛之, 山田昇, 酒井清吾, 小林広明

    第一回日本ヒートアイランド学会全国大会 2006/07/27

  126. Performance Evaluation of SX-7 Using Real Simulation Codes

    Hiroyuki Takizawa, Akihiro Musa, Takashi Soga, Yoshiaki Matsumura, Manabu Ito, Hiroaki Kobayashi

    ハイパフォーマンスコンピューティングと計算科学シンポジウム(HPCS2006) 2006/01

  127. A Distributed and Cooperative Load Balancing Mechanism for Large-scale P2P Systems

    Yoshitomo Murata, Tsutomu Inaba, Hiroyuki Takizawa, Hiroaki Kobayashi

    先端的ネットワーク&コンピューティングテクノロジワークショップ 2005/10

  128. P2Pコンピューティングのための分散協調スケジューリング機構

    村田善智, 稲葉勉, 滝沢寛之, 小林広明

    先端的ネットワーク&コンピューティングテクノロジワークショップ 2005/01

  129. A P2P Semantic Information Searching Mechanism for Ubiquitous Grid Computing Systems

    Tsutomu Inaba, Takuro Ohkawa, Yoshitomo Murata, Hiroyuki Takizawa, Hiroaki Kobayashi

    先端的ネットワーク&コンピューティングテクノロジワークショップ 2005/01

Research Projects 35

  1. Programming Environments for High-performance Computing Competitive

    System: JST Basic Research Programs (Core Research for Evolutional Science and Technology :CREST)

    2011/10 - Present

  2. High-performance low-power processor Competitive

    System: Grant-in-Aid for Scientific Research

    2003/03 - Present

  3. 線状降水帯の気象場変化に対する応答の解明: WRFアンサンブル計算を用いて

    平賀優介, 滝沢寛之

    Offer Organization: 学際大規模情報基盤共同利用・共同研究拠点(JHPCN)

    System: 公募型共同研究課題

    2024/04 - 2025/03

  4. Development of system reliability improvement technology based on medium- to long-term failure prediction

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research (B)

    Institution: Tokyo Denki University

    2021/04/01 - 2024/03/31

  5. Creation of Scalable Computers and their System Software for Post-Moore Era

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research (A)

    Institution: Institute of Physical and Chemical Research

    2020/04/01 - 2024/03/31

  6. Expanding Industrial Use of Innovative Technology for Transportation Equipment Design Using Microdevices Through Large-Scale Simulation

    Offer Organization: Tohoku University Cyberscience Center

    System: JHPCN:Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures

    Institution: Tohoku University

    2017 - 2024

  7. Research, Development, and Application of Real Particle Simulations for Plasma Interdisciplinary Science

    Hiroaki Ohtani, Shunsuke Usami, Hiroki Hasegawa, Toseo Moritaka, Masanori Nunami, Mieko Toida, Hideaki Miura, Seiji Ishiguro, Ritoku Horiuchi, Nobuaki Ohno, Shintaro Kawahara, Hideyuki Usui, Yohei Miyake, Mitsue Den, Tomoya Ogawa, Keiichiro Fukazawa, Takahiro Katagiri, Hiroyuki Takizawa

    Offer Organization: Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures (JHPCN)

    System: Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures: Joint Research Projects (General Joint Research Projects)

    2023/04 -

  8. 日本全土の洪水氾濫被害と適応策の検討

    峠 嘉哉, 滝沢 寛之, 風間 聡, 山本 道, 柳原 駿太, 池本 敦哉, 岡本 彩果

    Offer Organization: 学際大規模情報基盤共同利用・共同研究拠点

    System: 公募型共同研究

    Category: 一般共同研究課題

    Institution: 東北大学

    2022/04 - 2023/03

  9. 日本全土の洪水氾濫被害の将来展望

    風間 聡, 滝沢 寛之, 峠 嘉哉, 柳原 駿太

    Offer Organization: 学際大規模情報基盤共同利用・共同研究拠点

    System: 公募型共同研究

    Category: 一般共同研究課題

    Institution: 東北大学

    2021/04 - 2022/03

  10. Creation of non-Neumann FPGA Overlay Architecture for Innovating HPC

    Sano Kentaro

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research (B)

    2017/04/01 - 2020/03/31

    We have developed fundamental technologies for a non-von Neumann overlay architecture that exploits FPGAs, which are circuit-reconfigurable semiconductor devices, in order to realize next-generation HPC systems in place of von Neumann architectures, whose performance improvement is slowing down. Using a prototype FPGA cluster, we constructed its hardware and software framework and developed a high-level synthesis compiler that implements computing problems as data-flow circuits on FPGAs. We showed that a pipelining method can increase the performance of several computing problems in proportion to the number of FPGAs. This demonstrates that relatively low-power FPGAs can achieve high-performance and scalable computing.

  11. Supporting performance-aware programming with machine learning techniques

    Hiroyuki Takizawa, Kobayashi Hiroaki, Suda Reiji, Okatani Takayuki, Egawa Ryusuke, Ohshima Satoshi

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research (B)

    Institution: Tohoku University

    2016/04/01 - 2019/03/31

    This work has presented case studies of using machine learning techniques effectively to support High-Performance Computing (HPC) programming. Various code optimization problems can be solved by converting them into problems that are already known to be solvable by machine learning. This work also clarified the importance of analyzing the target problem before applying machine learning, because a sufficient amount of training data is unlikely to be available for code optimization problems. Like HPC programming, machine learning also requires the knowledge and experience of human experts. In machine learning, however, the problem is already parameterized and hence can be solved if sufficiently high performance is available.

  12. Research on Software Autotuning Mechanism that evolves to unknown computing environments

    SUDA Reiji, YASUGI Masahiro, KATAGIRI Takahiro

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Challenging Exploratory Research

    Institution: The University of Tokyo

    2015/04/01 - 2018/03/31

    Autotuning is a technology that aims to attain good execution performance in various computational environments by building variability into software and letting the software itself control that variability. In this research, we aimed to develop a methodology for infusing variability and control mechanisms that were unintended by, or even unknown to, existing code, so that autotuning remains possible when novel computational environments and novel kinds of variability emerge. We have shown that Xevolver, developed by our team members, can infuse variability and autotuning mechanisms that are unknown to the original code. However, it became clear that the original code must be fully analyzed before such infusions are applied.

  13. Design Space Exploration of Future Microprocessors using the post CMOS devices

    EGAWA Ryusuke, Kobayashi Hiroaki, Takizawa Hiroyuki, Tada Jubee, Sato Masayuki, Uno Wataru, Toyoshima Takuya, Sakai Zentaro, Ogasawara Daisuke

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Challenging Exploratory Research

    Institution: Tohoku University

    2015/04/01 - 2018/03/31

    In this research, to realize a highly energy-efficient microprocessor using novel device technologies of the post-Moore era, which are expected to become practical around 2025, we have worked on circuit and memory subsystem designs. For the circuit design, we worked on a design method for wave-pipelined circuits using CNFETs. For the memory subsystem, we focused on die-stacking and STT-RAM technologies. We have examined a cache-bypass mechanism, an energy-efficient data allocation method for multi-bank memory, and a power-aware control mechanism for STT-RAM last-level caches.

  14. A Green Microarchitecture in 5.5D-Design Era

    EGAWA RYUSUKE, Kobayashi Hiroaki, Takizawa Hiroyuki, Sato Masayuki, Uno Wataru, Nishimura Shin, Hosokawa Mikio, Toyoshima Takuya

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research (B)

    Institution: Tohoku University

    2014/04/01 - 2017/03/31

    To clarify the design space of future microprocessors after the end of Moore's law, this research project focuses on vertical integration technologies such as 2.5D and 3D integration using through-silicon vias (TSVs). Since TSVs have high potential to shorten latency and reduce power consumption in microprocessors and computing systems, these technologies are expected to overcome the limits of technology scaling. In this research, we explore the design space of future microprocessors by aggressively using TSVs at various stacking granularities. The evaluation results show that appropriate use of TSVs, taking into account the trade-off among performance, power, and cost, can drastically improve the energy efficiency of microprocessors and computer systems.

  15. Checkpoint restart technologies for hierarchical storages

    Hiroyuki Takizawa, Uno Atsuya, Kobayashi Hiroaki, Egawa Ryusuke, Sato Yukinori

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Challenging Exploratory Research

    Institution: Tohoku University

    2014/04/01 - 2016/03/31

    Assuming that the state of an application is periodically saved during its execution, we have considered an automatic tuning method for the frequency of saving the state to a hierarchical storage system, and have also discussed a way to reduce the time for writing the state to the storage. A promising approach to this reduction is to speculatively write data that will be written in the future with high probability. One technical issue is therefore how to predict such data. The prediction requires analyzing the memory access patterns of the target application, so we have developed a performance analysis tool for this purpose. The validity and effectiveness of the proposed methods were evaluated through job scheduling simulation of a large-scale computing system.

  16. A 3D Processor Architecture Co-Designed with Dependable Processing

    Kobayashi Hiroaki, TAKIZAWA HIROYUKI, EGAWA RYUSUKE

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Challenging Exploratory Research

    Institution: Tohoku University

    2014/04/01 - 2016/03/31

    The objective of this study is to establish a novel processor architecture that realizes both high performance and high dependability in the execution of a wide variety of applications by using 3D die-stacking technology toward the post-Moore era. In particular, we have developed a 3D die-stacked memory subsystem architecture integrated with processor cores, together with its data management mechanism, for a highly power-efficient and high-throughput memory hierarchy. In addition, we have developed an online checkpoint/restart mechanism that uses 3D die-stacked on-chip memory to increase the dependability of the processor. The proposed architecture has been evaluated quantitatively with a wide variety of applications, and its effectiveness and limitations have been clarified and discussed.

  17. Infrastructures for accelerating the synergy effect of software-hardware co-design

    Hiroyuki Takizawa, Kobayashi Hiroaki, Aoki Takafumi, Sano Kentaro, Egawa Ryusuke, Tada Jube, Ito Koichi

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research (B)

    Institution: Tohoku University

    2013/04/01 - 2016/03/31

    Assuming OpenCL as a standard environment for accelerator programming, we have pointed out some features that are missing for supporting a wider variety of accelerator architectures, and proposed OpenCL extensions. Although OpenCL has gradually come to be used for hardware description, OpenCL C is not necessarily appropriate for describing OpenCL kernels. Hence, we have designed and implemented high-productivity languages for typical computations in the fields of image processing and high-performance computing. In addition, we have proposed an automatic tuning method for performance parameters, which need to be adjusted for individual accelerators. The proposed method has been implemented to evaluate its performance impact.

  18. A Universal Memory Architecture Based on Device-Architecture Co-Design

    Kobayashi Hiroaki, TAKIZAWA HIROYUKI, EGAWA RYUSUKE

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research (B)

    Institution: Tohoku University

    2013/04/01 - 2016/03/31

    The objective of this study is to establish a smart memory subsystem architecture that can consider memory access behaviors of applications and effectively manage data in the memory hierarchy in terms of performance and power efficiency. In particular, we have developed 1) a low-power/high-bandwidth cache architecture, 2) a cache management policy with an online evaluation of the memory request behavior of an application for reducing its working set in the memory hierarchy, 3) a cache partitioning mechanism to protect performance-sensitive shared data for chip multicore processors, and 4) a memory address mapping mechanism with the performance/performance optimization by using an online estimation of memory access behavior.

  19. Application-Aware Highly Hierarchical Memory Architecture

    KOBAYASHI Hiroaki, TAKIZAWA Hiroyuki, EGAWA Ryusuke

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Challenging Exploratory Research

    Institution: Tohoku University

    2012/04/01 - 2014/03/31

    The objective of this study is to establish a novel on-chip memory architecture that provides running applications with the memory resources they need, taking into account their behaviors and requirements regarding the memory subsystem of a multi-core processor. In this study, we have developed a cache-resource management mechanism to realize energy-efficient, high-performance execution of multi-threaded applications on a multi-core processor. In cooperation with the hardware functions developed for cache resizing and partitioning, which reduce cache conflicts and maximize the efficiency of cache utilization, this mechanism can extract the potential of multi-core processors with low power consumption.

  20. Study of Next-Generation CFD toward Petaflops Computers

    NAKAHASHI Kazuhiro, YAMAMOTO Satoru, OBAYASHI Shigeru, KOBAYASHI Hiroaki, YAMAMOTO Kazuomi, SASAKI Daisuke, JEONG Shinkyu, TAKIZAWA Hiroyuki, EGAWA Ryusuke, KUROTAKI Takuji, ENOMOTO Shunji, IMAMURA Taro, TAKAHASHI Shun

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research (S)

    2009/05/11 - 2014/03/31

    This study aimed to solve problems of current CFD in the aerodynamic design of aircraft, such as the dependence of computational results on physical models and the growing workload of handling complex geometries. The Building-Cube Method was proposed with further improvements in computer performance in mind, and various algorithmic studies were conducted toward its practical use. One achievement was demonstrated by a world-leading large-scale flow computation around a car using the K computer. Significantly, the proposed CFD approach can directly handle extremely complicated and incomplete CAD data in the simulation. This can be a game-changing technology for the aerodynamic design process of aircraft and automobiles.

  21. Study on a framework for auto generation and optimization of HPC accelerator architectures

    SANO Kentaro, TAKIZAWA Hiroyuki

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Challenging Exploratory Research

    Institution: Tohoku University

    2011 - 2013

    We focused on the algorithm domain of stencil computation and cellular automata computation, which is representative of high-performance computing, and studied a framework for automatically generating acceleration hardware for such computations for reconfigurable computing with FPGAs. In this project, we have developed a stencil compiler for an FPGA-based systolic array and a high-level synthesis compiler for FPGA-based stream-computing accelerators. They are significant and fundamental technologies for highly productive reconfigurable high-performance computing with FPGAs.

  22. Technologies for realizing highly-efficient and highly-dependable heterogeneous computing systems in the post Petascale era.

    TAKIZAWA Hiroyuki

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Young Scientists (B)

    Institution: Tohoku University

    2011 - 2012

    A virtualization technique has been proposed to hide the heterogeneous configuration of different processors, by automatic task allocation considering their strengths and weaknesses. OpenCL has also been applied to programming of large-scale systems of various computing nodes. For high dependability, a transparent checkpoint restart mechanism for OpenCL applications has been developed. This work also investigated the practicality and limitations of OpenACC.

  23. Innovative 3D Design for the New Generation Vector Microarchitecture

    KOBAYASHI Hiroaki, TAKIZAWA Hiroyuki, EGAWA Ryusuke

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research (B)

    Institution: Tohoku University

    2010 - 2012

    This study discusses a new design methodology for the microarchitecture of next-generation, low-power, high-performance vector processors using 3D die-stacking technology. We have also proposed a strategy that mixes conventional 2D design with TSV (through-silicon via)-based 3D design and realizes a good trade-off between them at all levels of on-chip unit design. Through the performance evaluation of a prototype 3D vector processor, the effectiveness of 3D design in terms of power consumption and performance has been clarified.

  24. High-performance computing using graphics hardware Competitive

    System: Grant-in-Aid for Scientific Research

    2003/04 - 2011/09

  25. Development of Auto-tuning Specification Language Towards Manycore and Massively Parallel Processing Era

    KATAGIRI Takahiro, IMAMURA Toshiyuki, SUDA Reiji, KURODA Hisayasu, ITOH Shoji, IWASHITA Takeshi, TAKIZAWA Hiroyuki

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research (B)

    Institution: The University of Tokyo

    2009 - 2011

    In this research, the following developments were made to establish an auto-tuning (AT) facility for high-performance execution on several computer environments: (1) functional extension of an AT language, named ABCLibScript, for multicore and massively parallel environments; (2) evaluation of the AT facility with multicore CPUs and GPUs; (3) evaluation of the effectiveness of the AT facility in ABCLibScript by applying it to several application programs; (4) release of the preprocessor code for the developed ABCLibScript as free software via the internet.

  26. A High-Performance Computing Framework to Exploit Various Processors

    TAKIZAWA Hiroyuki

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Young Scientists (B)

    Institution: Tohoku University

    2009 - 2010

    The purpose of this work is to achieve a high-performance computing framework that can exploit the computing power of each processor in a heterogeneous computing system while keeping the portability of source codes. For making good use of various computing resources, this work explores an auto-tuning mechanism of a high-level language, numerical libraries seamlessly used from the high-level language, and a job scheduling method.

  27. Acceleration of large-scale data clustering Competitive

    1999/10 - 2009/03

  28. Study on Hardware-Software Collaborative Scheduling for Highly Efficient Multithreading

    KOBAYASHI Hiroaki, NAKAMURA Tadao, SUZUKI Kenichi, TAKIZAWA Hiroyuki, EGAWA Ryusuke, SATO Yukinori, KOTERA Isao, FUNAYA Yusuke, SATO Masayuki

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research (B)

    Institution: Tohoku University

    2006 - 2009

  29. Large-scale distributed computing with idle computers Competitive

    System: Grant-in-Aid for Scientific Research

    2003/03 - 2008/03

  30. A study of a unified software development scheme in the heterogeneous multicore era

    TAKIZAWA Hiroyuki

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Young Scientists (B)

    Institution: Tohoku University

    2007 - 2008

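  31. Ultra-large-scale data mining with safe and secure volunteer computing

    KOBAYASHI Hiroaki, TAKIZAWA Hiroyuki

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research on Priority Areas

    Institution: Tohoku University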

    2007 - 2008

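    This research aims to establish the fundamental technologies for realizing large-scale data mining through volunteer computing that exploits the functions and performance of home game consoles. In fiscal year 2008, we built a distributed data mining system for analyzing physical phenomena near a rocket exhaust nozzle, and conducted a demonstration experiment of large-scale data mining in a volunteer computing environment consisting of PLAYSTATION 3 consoles and InTrigger. The results showed that when conventional centralized task scheduling is used for dynamic load balancing, load balancing becomes inefficient as computing resources increase, so the expected performance cannot be achieved in a large-scale volunteer computing environment. In contrast, the distributed cooperative scheduling mechanism proposed in this research was shown to perform dynamic load balancing efficiently even as the number of computing resources increases. This evaluation showed that the proposed mechanism is an effective means of realizing dynamic load balancing in large-scale volunteer computing environments. We also proposed a worker-side scheduling method so that volunteers participating in multiple projects do not waste idle computing power. As a mechanism for improving the reliability of volunteer computing, we also proposed a method for efficiently verifying the validity of computation results. Simulations showed that malicious workers can be detected by quantifying the reliability of each worker and updating it based on the validity evaluation of its computation results. Furthermore, focusing on the high rendering performance of home game consoles, we examined how to use that rendering performance for data mining and studied a programming framework that makes such programming easy.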

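  32. Ultra-large-scale data mining with safe and secure volunteer computing

    KOBAYASHI Hiroaki, TAKIZAWA Hiroyuki

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research on Priority Areas

    Institution: Tohoku University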

    2006 - 2006

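    This fiscal year, we focused on data clustering (DC) and neural networks (NN), which demand particularly high computing performance among representative data mining methods, and examined implementation methods for executing them efficiently on home game consoles. Specifically, we studied how to use the Cell Broadband Engine (CBE), a high-performance processor installed in home game consoles, and graphics processing units (GPUs) effectively for data mining, and carried out implementations and quantitative performance evaluations. As research on large-scale P2P computing, we studied a distributed computing resource management mechanism for efficiently searching the vast number of idle computing resources spread across the network for those that satisfy user requirements. We found that user requests exhibit temporal and spatial locality similar to that seen in the memory access behavior of computers, and that exploiting this locality can dramatically improve search efficiency. This fiscal year, we particularly considered resource discovery in heterogeneous environments and examined a mechanism that automatically adjusts the number of P2P connections according to how frequently a resource is used. As a mechanism for coordinating a vast number of computers, we also advanced research on a fully distributed dynamic load balancing mechanism and designed its basic control scheme. In preparation for realizing a safe and secure distributed data mining system with tamper-resistant computation on a volunteer computing infrastructure, we set up the development environment this fiscal year. We also collected related materials and held discussions with the parties concerned.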

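  33. Clustering methods for multidimensional time-series data mining and their parallelization

    TAKIZAWA Hiroyuki

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Young Scientists (B)

    Institution: Tohoku University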

    2003 - 2004

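    Data clustering requires many distance computations between high-dimensional vectors to search for the nearest cluster (nearest neighbor search), and this computational load becomes a major issue when it is applied to large-scale problems. In fiscal year 2003, focusing on the rapid progress of graphics hardware (GPUs) for personal computers (PCs), this research achieved fast nearest neighbor search by using a commodity GPU as a parallel processor (GPGPU). In fiscal year 2004, we applied that result to develop a method for fast data clustering through cooperation between the GPU and the CPU. This method can effectively exploit the two kinds of parallelism in the distance computation of nearest neighbor search, and the result was highly regarded academically, receiving a best paper award at an international conference. We also proposed a method for effectively parallelizing competitive learning, which is applicable to data clustering, on a PC cluster, and the result was published in an international journal. We also continued to study visualization, an important element of data mining, and demonstrated through a connection experiment over Super SINET between Hokkaido University and Tohoku University that a visualization server can be used interactively from a remote site. It was demonstrated that a physically remote compute server can be used for clustering and for subsequent visualization such as volume rendering, and thus for data mining. This result is to be published in an academic journal. The method of Chinrungueng et al. minimizes the average distortion by evaluating the optimality of clusters using the partial distortion entropy. However, it requires many iterations to form appropriate clusters and may not be able to follow temporal changes in time-series data quickly. In this research, we newly proposed a method that relocates clusters appropriately based on the partial distortion entropy, and confirmed that applying it to adaptive vector quantization of video achieves both tracking speed and distortion minimization performance.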

  34. An Intelligent Memory Architecture for 3D Graphics

    KOBAYASHI Hiroaki, NAKAMURA Tadao, SUZUKI Ken-ichi, TAKIZAWA Hiroyuki, SANO Kentaro

    Offer Organization: Japan Society for the Promotion of Science

    System: Grants-in-Aid for Scientific Research

    Category: Grant-in-Aid for Scientific Research (B)

    Institution: Tohoku University

    2002 - 2004

    We have the following achievements.

    (1) High-performance graphics algorithm and its hardware: We analyzed parallelism and locality of reference in a graphics algorithm based on the global illumination model, and designed a novel rendering pipeline architecture for this algorithm. In addition, we designed and developed prototype hardware based on the architecture. Through the performance evaluation of the hardware, we showed its effectiveness for realizing interactive ray tracing. Moreover, we designed a new high-performance algorithm for generating walkthrough animations.

    (2) Power-efficient memory mechanism: For the design of the intelligent memory architecture for mobile devices, a low-power mechanism for the on-chip memory system was designed. In this mechanism, memory modules are activated and deactivated based on their activity during program execution. We clarified the relationship between activated memory modules and sustained performance, and showed the effectiveness of power-aware computing for on-chip cache memory.

    (3) Data compression algorithms for graphics hardware: We applied vector quantization to volume data sets to achieve efficient data compression, and designed a visualization algorithm that can directly visualize the compressed volume data. We also designed a novel data compression algorithm using data clustering for graphics hardware.

  35. Efficient active learning of neural networks Competitive

    1995/04 - 1999/10

Social Activities 2

  1. GPU Computing Seminar @ Tohoku University

    2009/12/17 -

    Gave a talk at a company-hosted seminar on the latest trends and future prospects in related research fields.

  2. Special Lecture on Advanced Course Research, Sendai National College of Technology

    2009/12/16 -

    Gave a special lecture at the Hirose Campus of Sendai National College of Technology.

Media Coverage 1

  1. Young HPC Researchers Take Global Stage

    HPCwire

    2014/05/15

    Type: Other

Other 4

  1. ExaFSA

    Developing numerical simulations of Fluid-Structure-Acoustic Interactions

  2. Development of System Software Technologies for post-Peta Scale High Performance Computing

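    Establishing a methodology for smoothly migrating the valuable software assets developed so far to massively parallel heterogeneous systems of the post-petascale generation is a key challenge that must be accomplished within the next few years, and a development environment that supports this work is strongly desired. Taking into account compatibility with existing software assets and the continuity of software development, this research aims to realize a development environment for massively parallel heterogeneous systems through an evolutionary approach that creates a new environment on the basis of existing ones. Specifically, we develop new software development technologies for massively parallel heterogeneous systems at each level (language processors, libraries, runtime environments, support tools, and applications) and realize a development environment based on them.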

  3. Building a rapid prototyping environment for interactive physics simulation

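    The purpose of this research is to realize a development environment in which the multiple processors found in today's common game consoles can easily be used, each where it fits best, in order to assist the development of interactive physics simulations and the photorealistic image generation applications that work with them. In recent years, the rendering performance of game consoles has improved dramatically, and it is becoming possible to interactively render images that could be mistaken for the real thing. However, the more photorealistic the game screen is, the more conspicuous any unnatural motion that violates the laws of physics becomes, which stands in the way of generating even higher-quality photorealistic images. Therefore, to give players a sense of virtual reality, the people and objects rendered on the game screen must move naturally in terms of physical laws, and interactive physics simulation and the photorealistic image generation based on it will become increasingly important in games, where interactivity is required. For this reason, this research builds an environment for easily prototyping high-performance interactive physics simulations and experimenting by trial and error at the early stage of game development.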

  4. Research and development of a safe, secure, and low-cost ubiquitous computing platform for creating an ICT eco-society

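    Aiming to establish an ecology model in the information and communications field, we research and develop a safe, secure, and low-cost volunteer computing infrastructure for the ubiquitous era that makes use of computing resources found throughout society. In particular, we study ways to make volunteer computing more efficient and more reliable, as well as incentive models that encourage participation, and establish the fundamental technologies for inexpensively providing a large-scale computing infrastructure that can be used even for highly confidential computations and whose scale would be difficult to achieve with conventional implementation technologies.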