Researcher Details

コジマ カナメ
小島 要
Kaname Kojima
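Affiliation
Tohoku Medical Megabank Organization, Department of Genome Analysis
Job Title
Lecturer
Degrees
  • Ph.D. in Information Science and Technology (The University of Tokyo)

  • Master of Information Science and Technology (The University of Tokyo)

e-Rad Researcher Number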
10646988

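Education 3

  • The University of Tokyo, Graduate School of Information Science and Technology, Department of Computer Science, doctoral program completed

    ~ March 2011

  • The University of Tokyo, Graduate School of Information Science and Technology, Department of Computer Science, master's program completed

    ~ March 2008

  • Tokyo Institute of Technology, School of Engineering, Department of Computer Science, graduated

    ~ September 2005

Research Fields 2

  • Life sciences / Systems genomics /

  • Informatics / Life, health, and medical informatics /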

Papers 83

  1. Deep learning-based histopathological assessment of tubulo-interstitial injury in chronic kidney diseases (peer-reviewed)

    Nonoka Suzuki, Kaname Kojima, Silvia Malvica, Kenshi Yamasaki, Yoichiro Chikamatsu, Yuji Oe, Tasuku Nagasawa, Ekyu Kondo, Satoru Sanada, Setsuya Aiba, Hiroshi Sato, Mariko Miyazaki, Sadayoshi Ito, Mitsuhiro Sato, Tetsuhiro Tanaka, Kengo Kinoshita, Yoshihide Asano, Avi Z. Rosenberg, Koji Okamoto, Kosuke Shido

    Communications Medicine 5 (3) January 5, 2025

    DOI: 10.1038/s43856-024-00708-3  

  2. Two-stage strategy using denoising autoencoders for robust reference-free genotype imputation with missing input genotypes (invited, peer-reviewed)

    Kaname Kojima, Shu Tadaka, Yasunobu Okamura, Kengo Kinoshita

    Journal of Human Genetics 69 (10) 511-518 June 25, 2024

    Publisher: Springer Science and Business Media LLC

    DOI: 10.1038/s10038-024-01261-6  

    ISSN:1434-5161

    eISSN:1435-232X

    Widely used genotype imputation methods are based on the Li and Stephens model, which assumes that new haplotypes can be represented by modifying existing haplotypes in a reference panel through mutations and recombinations. These methods use genotypes from SNP arrays as inputs to estimate haplotypes that align with the input genotypes by analyzing recombination patterns within a reference panel, and then infer unobserved variants. While these methods require reference panels in an identifiable form, their public use is limited due to privacy and consent concerns. One strategy to overcome these limitations is to use de-identified haplotype information, such as summary statistics or model parameters. Advances in deep learning (DL) offer the potential to develop imputation methods that use haplotype information in a reference-free manner by handling it as model parameters, while maintaining comparable imputation accuracy to methods based on the Li and Stephens model. Here, we provide a brief introduction to DL-based reference-free genotype imputation methods, including RNN-IMP, developed by our research group. We then evaluate the performance of RNN-IMP against widely-used Li and Stephens model-based imputation methods in terms of accuracy (R²), using the 1000 Genomes Project Phase 3 dataset and corresponding simulated Omni2.5 SNP genotype data. Although RNN-IMP is sensitive to missing values in input genotypes, we propose a two-stage imputation strategy: missing genotypes are first imputed using denoising autoencoders; RNN-IMP then processes these imputed genotypes. This approach restores the imputation accuracy that is degraded by missing values, enhancing the practical use of RNN-IMP.

  3. Convolutional neural network-based skin image segmentation model to improve classification of skin diseases in conventional and non-standardized picture images (peer-reviewed)

    Yuta Yanagisawa, Kosuke Shido, Kaname Kojima, Kenshi Yamasaki

    Journal of Dermatological Science 109 (1) 30-36 January 2023

    Publisher: Elsevier BV

    DOI: 10.1016/j.jdermsci.2023.01.005  

    ISSN:0923-1811

  4. Ehlers-Danlos syndrome type IV with a novel COL3A1 exon 14 skipping variation confirmed by Tohoku Medical Megabank Organization genomic database. (international journal, peer-reviewed)

    Kosuke Shido, Kaname Kojima, Saaya Yoshida-Akai, Katsuko Kikuchi, Atsushi Hatamochi, Setsuya Aiba, Kenshi Yamasaki

    The Journal of Dermatology 48 (12) 1918-1922 December 2021

    DOI: 10.1111/1346-8138.16131  

    A novel COL3A1 variant was identified in a Japanese case of Ehlers-Danlos syndrome type IV (EDS-IV) with a characteristic "Madonna" face, fragile uterus, and easy bruising in addition to a history of cavernous sinus fistula. We confirmed variable diameters of collagen fibrils in the dermis and a decrease in type 3 collagen production from cultured fibroblasts. Genomic DNA sequencing of the COL3A1 region and sequencing of the COL3A1 cDNA expressed in cultured fibroblasts identified that a nucleotide variation at c.951+2T>G in intron 14 leads to skipping of exon 14 in COL3A1 cDNA. The novel variation in the splice site of the COL3A1 region, g.IVS14+2T>G, was not listed in the EDS-IV pathogenic genetic databases including the Human Gene Mutation Database, ClinVar, and the Leiden Open Variation Database. Using the whole genome sequence database of 8380 Japanese individuals reported by the Tohoku Medical Megabank Organization (ToMMo) cohort study, we also confirmed that COL3A1 g.IVS14+2T>G was not a common single nucleotide variation in the Japanese population, although 13 EDS-related COL3A1 variants were identified in the ToMMo database of 8380 Japanese individuals. These results demonstrated that our case of EDS-IV was a result of the novel variation of COL3A1 g.IVS14+2T>G. These statistical genetics approaches, combining the ToMMo database of 8380 Japanese individuals with pathogenic genetic databases, are a useful method to confirm the uniqueness of a novel variation in Japanese individuals.

  5. Genetic loci for lung function in Japanese adults with adjustment for exhaled nitric oxide levels as airway inflammation indicator (international journal, peer-reviewed)

    Mitsuhiro Yamada, Ikuko N Motoike, Kaname Kojima, Nobuo Fuse, Atsushi Hozawa, Shinichi Kuriyama, Fumiki Katsuoka, Shu Tadaka, Matsuyuki Shirota, Miyuki Sakurai, Tomohiro Nakamura, Yohei Hamanaka, Kichiya Suzuki, Junichi Sugawara, Soichi Ogishima, Akira Uruno, Eiichi N Kodama, Naoya Fujino, Tadahisa Numakura, Tomohiro Ichikawa, Ayumi Mitsune, Takashi Ohe, Kengo Kinoshita, Masakazu Ichinose, Hisatoshi Sugiura, Masayuki Yamamoto

    Communications Biology 4 (1) 1288 November 15, 2021

    DOI: 10.1038/s42003-021-02813-8  

    Lung function reflects the ability of the respiratory system and is utilized for the assessment of respiratory diseases. Because type 2 airway inflammation influences lung function, genome wide association studies (GWAS) for lung function would be improved by adjustment with an indicator of the inflammation. Here, we performed a GWAS for lung function with adjustment for exhaled nitric oxide (FeNO) levels in two independent Japanese populations. Our GWAS with genotype imputations revealed that the RNF5/AGER locus including AGER rs2070600 SNP, which introduces a G82S substitution of AGER, was the most significantly associated with FEV1/FVC. Three other rare missense variants of AGER were further identified. We also found genetic loci with three candidate genes (NOS2, SPSB2 and RIPOR2) associated with FeNO levels. Analyses with the BioBank-Japan GWAS resource revealed genetic links of FeNO and asthma-related traits, and existence of common genetic background for allergic diseases and their biomarkers. Our study identified the genetic locus most strongly associated with airway obstruction in the Japanese population and three genetic loci associated with FeNO, an indicator of type 2 airway inflammation in adults.

  6. GWAS Identified IL4R and the Major Histocompatibility Complex Region as the Associated Loci of Total Serum IgE Levels in 9,260 Japanese Individuals (international journal, peer-reviewed)

    Kosuke Shido, Kaname Kojima, Matsuyuki Shirota, Kenshi Yamasaki, Ikuko N Motoike, Atsushi Hozawa, Soichi Ogishima, Naoko Minegishi, Kozo Tanno, Fumiki Katsuoka, Gen Tamiya, Setsuya Aiba, Masayuki Yamamoto, Kengo Kinoshita

    Journal of Investigative Dermatology 141 (11) 2749-2752 November 2021

    DOI: 10.1016/j.jid.2021.02.762  

  7. Facial UV photo imaging for skin pigmentation assessment using conditional generative adversarial networks (international journal, peer-reviewed)

    Kaname Kojima, Kosuke Shido, Gen Tamiya, Kenshi Yamasaki, Kengo Kinoshita, Setsuya Aiba

    Scientific Reports 11 (1) 1213 January 13, 2021

    DOI: 10.1038/s41598-020-79995-4  

    Skin pigmentation is associated with skin damage and skin cancers, and ultraviolet (UV) photography is used as a minimally invasive means for the assessment of pigmentation. Since UV photography equipment is not usually available in general practice, technologies emphasizing pigmentation in color photo images are desired for daily care. We propose a new method using conditional generative adversarial networks, named UV-photo Net, to generate synthetic UV images from color photo images. Evaluations using color and UV photo image pairs taken by a UV photography system demonstrated that pigment spots were well reproduced in synthetic UV images by UV-photo Net, and some of the reproduced pigment spots were difficult to recognize in color photo images. In the pigment spot detection analysis, the rate of pigment spot areas in cheek regions for synthetic UV images was highly correlated with the rate for UV photo images (Pearson's correlation coefficient 0.92). We also demonstrated that UV-photo Net was effective for bringing out pigment spots in photo images taken by a smartphone camera. UV-photo Net enables an easy assessment of pigmentation from color photo images and will promote self-care of skin damage and early signs of skin cancers for preventive medicine.

  8. A genotype imputation method for de-identified haplotype reference information by using recurrent neural network. (international journal, peer-reviewed)

    Kaname Kojima, Shu Tadaka, Fumiki Katsuoka, Gen Tamiya, Masayuki Yamamoto, Kengo Kinoshita

    PLoS Computational Biology 16 (10) e1008207 October 2020

    DOI: 10.1371/journal.pcbi.1008207  

    Genotype imputation estimates the genotypes of unobserved variants using the genotype data of other observed variants based on a collection of haplotypes for thousands of individuals, which is known as a haplotype reference panel. In general, more accurate imputation results were obtained using a larger size of haplotype reference panel. Most of the existing genotype imputation methods explicitly require the haplotype reference panel in precise form, but the accessibility of haplotype data is often limited, due to the requirement of agreements from the donors. Since de-identified information such as summary statistics or model parameters can be used publicly, imputation methods using de-identified haplotype reference information might be useful to enhance the quality of imputation results under the condition where the access of the haplotype data is limited. In this study, we proposed a novel imputation method that handles the reference panel as its model parameters by using bidirectional recurrent neural network (RNN). The model parameters are presented in the form of de-identified information from which the restoration of the genotype data at the individual-level is almost impossible. We demonstrated that the proposed method provides comparable imputation accuracy when compared with the existing imputation methods using haplotype datasets from the 1000 Genomes Project (1KGP) and the Haplotype Reference Consortium. We also considered a scenario where a subset of haplotypes is made available only in de-identified form for the haplotype reference panel. In the evaluation using the 1KGP dataset under the scenario, the imputation accuracy of the proposed method is much higher than that of the existing imputation methods. We therefore conclude that our RNN-based method is quite promising to further promote the data-sharing of sensitive genome data under the recent movement for the protection of individuals' privacy.

  9. Susceptibility Loci for Tanning Ability in the Japanese Population Identified by a Genome-Wide Association Study from the Tohoku Medical Megabank Project Cohort Study (international journal, peer-reviewed)

    Kosuke Shido, Kaname Kojima, Kenshi Yamasaki, Atsushi Hozawa, Gen Tamiya, Soichi Ogishima, Naoko Minegishi, Yosuke Kawai, Kozo Tanno, Yoichi Suzuki, Masao Nagasaki, Setsuya Aiba

    Journal of Investigative Dermatology 139 (7) 1605-1608 July 2019

    DOI: 10.1016/j.jid.2019.01.015  

  10. Genome-wide association studies identify PRKCB as a novel genetic susceptibility locus for primary biliary cholangitis in the Japanese population. (international journal, peer-reviewed)

    Minae Kawashima, Yuki Hitomi, Yoshihiro Aiba, Nao Nishida, Kaname Kojima, Yosuke Kawai, Hitomi Nakamura, Atsushi Tanaka, Mikio Zeniya, Etsuko Hashimoto, Hiromasa Ohira, Kazuhide Yamamoto, Masanori Abe, Kazuhiko Nakao, Satoshi Yamagiwa, Shuichi Kaneko, Masao Honda, Takeji Umemura, Takafumi Ichida, Masataka Seike, Shotaro Sakisaka, Masaru Harada, Osamu Yokosuka, Yoshiyuki Ueno, Michio Senju, Tatsuo Kanda, Hidetaka Shibata, Takashi Himoto, Kazumoto Murata, Yasuhiro Miyake, Hirotoshi Ebinuma, Makiko Taniai, Satoru Joshita, Toshiki Nikami, Hajime Ota, Hiroshi Kouno, Hirotaka Kouno, Makoto Nakamuta, Nobuyoshi Fukushima, Motoyuki Kohjima, Tatsuji Komatsu, Toshiki Komeda, Yukio Ohara, Toyokichi Muro, Tsutomu Yamashita, Kaname Yoshizawa, Yoko Nakamura, Masaaki Shimada, Noboru Hirashima, Kazuhiro Sugi, Keisuke Ario, Eiichi Takesaki, Atsushi Naganuma, Hiroshi Mano, Haruhiro Yamashita, Kouki Matsushita, Kazuhiko Yamauchi, Fujio Makita, Hideo Nishimura, Kiyoshi Furuta, Naohiro Takahashi, Masahiro Kikuchi, Naohiko Masaki, Tomohiro Tanaka, Sumito Tamura, Akira Mori, Shintaro Yagi, Ken Shirabe, Atsumasa Komori, Kiyoshi Migita, Masahiro Ito, Shinya Nagaoka, Seigo Abiru, Hiroshi Yatsuhashi, Michio Yasunami, Shinji Shimoda, Kenichi Harada, Hiroto Egawa, Yoshihiko Maehara, Shinji Uemoto, Norihiro Kokudo, Hajime Takikawa, Hiromi Ishibashi, Kazuaki Chayama, Masashi Mizokami, Masao Nagasaki, Katsushi Tokunaga, Minoru Nakamura

    Human Molecular Genetics 26 (3) 650-659 February 1, 2017

    DOI: 10.1093/hmg/ddw406  

    A previous genome-wide association study (GWAS) performed in 963 Japanese individuals (487 primary biliary cholangitis [PBC] cases and 476 healthy controls) identified TNFSF15 (rs4979462) and POU2AF1 (rs4938534) as strong susceptibility loci for PBC. In this study, we performed GWAS in an additional 1,923 Japanese individuals (894 PBC cases and 1,029 healthy controls), and combined the results with the previous data. This GWAS, together with a subsequent replication study in an independent set of 7,024 Japanese individuals (512 PBC cases and 6,512 healthy controls), identified PRKCB (rs7404928) as a novel susceptibility locus for PBC (odds ratio [OR] = 1.26, P = 4.13 × 10⁻⁹). Furthermore, a primary functional variant of PRKCB (rs35015313) was identified by genotype imputation using a phased panel of 1,070 Japanese individuals from a prospective, general population cohort study and subsequent in vitro functional analyses. These results may lead to improved understanding of the disease pathways involved in PBC, forming a basis for prevention of PBC and development of novel therapeutics.

  11. STR-realigner: a realignment method for short tandem repeat regions (international journal, peer-reviewed)

    Kaname Kojima, Yosuke Kawai, Kazuharu Misawa, Takahiro Mimori, Masao Nagasaki

    BMC Genomics 17 (1) 991 December 3, 2016

    DOI: 10.1186/s12864-016-3294-x  

    eISSN:1471-2164

    BACKGROUND: In the estimation of repeat numbers in a short tandem repeat (STR) region from high-throughput sequencing data, two types of strategies are mainly taken: a strategy based on counting repeat patterns included in sequence reads spanning the region and a strategy based on estimating the difference between the actual insert size and the insert size inferred from paired-end reads. The quality of sequence alignment is crucial, especially in the former approaches although usual alignment methods have difficulty in STR regions due to insertions and deletions caused by the variations of repeat numbers. RESULTS: We proposed a new dynamic programming based realignment method named STR-realigner that considers repeat patterns in STR regions as prior knowledge. By allowing the size change of repeat patterns with low penalty in STR regions, accurate realignment is expected. For the performance evaluation, publicly available STR variant calling tools were applied to three types of aligned reads: synthetically generated sequencing reads aligned with BWA-MEM, those realigned with STR-realigner, those realigned with ReviSTER, and those realigned with GATK IndelRealigner. From the comparison of root mean squared errors between estimated and true STR region size, the results for the dataset realigned with STR-realigner are better than those for other cases. For real data analysis, we used a real sequencing dataset from Illumina HiSeq 2000 for a parent-offspring trio. RepeatSeq and lobSTR were applied to the sequence reads for these individuals aligned with BWA-MEM, those realigned with STR-realigner, ReviSTER, and GATK IndelRealigner. STR-realigner shows the best performance in terms of consistency of the size of estimated STR regions in Mendelian inheritance. Root mean squared error values were also calculated from the comparison of these estimated results with STR region sizes obtained from high coverage PacBio sequencing data, and the results from the realigned sequencing data with STR-realigner showed the least (the best) root mean squared error value. CONCLUSIONS: The effectiveness of the proposed realignment method for STR regions was verified from the comparison with an existing method on both simulation datasets and real whole genome sequencing dataset.

  12. Short tandem repeat number estimation from paired-end reads for multiple individuals by considering coalescent tree. (international journal, peer-reviewed)

    Kaname Kojima, Yosuke Kawai, Naoki Nariai, Takahiro Mimori, Takanori Hasegawa, Masao Nagasaki

    BMC Genomics 17 Suppl 5 494 August 31, 2016

    DOI: 10.1186/s12864-016-2821-0  

    BACKGROUND: Two types of approaches are mainly considered for the repeat number estimation in short tandem repeat (STR) regions from high-throughput sequencing data: approaches directly counting repeat patterns included in sequence reads spanning the region and approaches based on detecting the difference between the insert size inferred from aligned paired-end reads and the actual insert size. Although the accuracy of repeat numbers estimated with the former approaches is high, the size of target STR regions is limited to the length of sequence reads. On the other hand, the latter approaches can handle STR regions longer than the length of sequence reads. However, repeat numbers estimated with the latter approaches are less accurate than those with the former approaches. RESULTS: We proposed a new statistical model named coalescentSTR that estimates repeat numbers from paired-end read distances for multiple individuals simultaneously by connecting the read generative model for each individual with their genealogy. In the model, the genealogy is represented by handling coalescent trees as hidden variables, and the summation of the hidden variables is taken on coalescent trees sampled based on phased genotypes located around a target STR region with Markov chain Monte Carlo. In the sampled coalescent trees, repeat number information from insert size data is propagated, and more accurate estimation of repeat numbers is expected for STR regions longer than the length of sequence reads. For finding the repeat numbers maximizing the likelihood of the model on the estimation of repeat numbers, we proposed a state-of-the-art belief propagation algorithm on sampled coalescent trees. CONCLUSIONS: We verified the effectiveness of the proposed approach from the comparison with existing methods by using simulation datasets and real whole genome and whole exome data for HapMap individuals analyzed in the 1000 Genomes Project.

  13. Rare variant discovery by deep whole-genome sequencing of 1,070 Japanese individuals. (international journal, peer-reviewed)

    Masao Nagasaki, Jun Yasuda, Fumiki Katsuoka, Naoki Nariai, Kaname Kojima, Yosuke Kawai, Yumi Yamaguchi-Kabata, Junji Yokozawa, Inaho Danjoh, Sakae Saito, Yukuto Sato, Takahiro Mimori, Kaoru Tsuda, Rumiko Saito, Xiaoqing Pan, Satoshi Nishikawa, Shin Ito, Yoko Kuroki, Osamu Tanabe, Nobuo Fuse, Shinichi Kuriyama, Hideyasu Kiyomoto, Atsushi Hozawa, Naoko Minegishi, James Douglas Engel, Kengo Kinoshita, Shigeo Kure, Nobuo Yaegashi, Masayuki Yamamoto

    Nature Communications 6 8018 August 21, 2015

    DOI: 10.1038/ncomms9018  

    The Tohoku Medical Megabank Organization reports the whole-genome sequences of 1,070 healthy Japanese individuals and construction of a Japanese population reference panel (1KJPN). Here we identify through this high-coverage sequencing (32.4× on average), 21.2 million, including 12 million novel, single-nucleotide variants (SNVs) at an estimated false discovery rate of <1.0%. This detailed analysis detected signatures for purifying selection on regulatory elements as well as coding regions. We also catalogue structural variants, including 3.4 million insertions and deletions, and 25,923 genic copy-number variants. The 1KJPN was effective for imputing genotypes of the Japanese population genome wide. These data demonstrate the value of high-coverage sequencing for constructing population-specific variant panels, which covers 99.0% SNVs of minor allele frequency ≥0.1%, and its value for identifying causal rare variants of complex human disease phenotypes in genetic association studies.

  14. A statistical variant calling approach from pedigree information and local haplotyping with phase informative reads. (international journal, peer-reviewed)

    Kaname Kojima, Naoki Nariai, Takahiro Mimori, Mamoru Takahashi, Yumi Yamaguchi-Kabata, Yukuto Sato, Masao Nagasaki

    Bioinformatics 29 (22) 2835-2843 November 15, 2013

    DOI: 10.1093/bioinformatics/btt503  

    MOTIVATION: Variant calling from genome-wide sequencing data is essential for the analysis of disease-causing mutations and elucidation of disease mechanisms. However, variant calling in low coverage regions is difficult due to sequence read errors and mapping errors. Hence, variant calling approaches that are robust to low coverage data are demanded. RESULTS: We propose a new variant calling approach that considers pedigree information and haplotyping based on sequence reads spanning two or more heterozygous positions termed phase informative reads. In our approach, genotyping and haplotyping by the assignment of each read to a haplotype based on phase informative reads are simultaneously performed. Therefore, positions with low evidence for heterozygosity are rescued by phase informative reads, and such rescued positions contribute to haplotyping in a synergistic way. In addition, pedigree information supports more accurate haplotyping as well as genotyping, especially in low coverage regions. Although heterozygous positions are useful for haplotyping, homozygous positions are not informative and weaken the information from heterozygous positions, as the majority of positions are homozygous. Thus, we introduce latent variables that determine zygosity at each position to filter out homozygous positions for haplotyping. In performance evaluation with parent-offspring trio sequencing data, our approach outperforms existing approaches in accuracy on the agreement with single nucleotide polymorphism array genotyping results. Also, performance analysis considering distance between variants showed that the use of phase informative reads is effective for accurate variant calling, and further performance improvement is expected with longer sequencing data. CONTACT: kojima@megabank.tohoku.ac.jp

  15. Identifying regulational alterations in gene regulatory networks by state space representation of vector autoregressive models and variational annealing. (international journal, peer-reviewed)

    Kaname Kojima, Seiya Imoto, Rui Yamaguchi, André Fujita, Mai Yamauchi, Noriko Gotoh, Satoru Miyano

    BMC Genomics 13 Suppl 1 S6 2012

    DOI: 10.1186/1471-2164-13-S1-S6  

    BACKGROUND: In the analysis of effects by cell treatment such as drug dosing, identifying changes on gene network structures between normal and treated cells is a key task. A possible way for identifying the changes is to compare structures of networks estimated from data on normal and treated cells separately. However, this approach usually fails to estimate accurate gene networks due to the limited length of time series data and measurement noise. Thus, approaches that identify changes on regulations by using time series data on both conditions in an efficient manner are demanded. METHODS: We propose a new statistical approach that is based on the state space representation of the vector autoregressive model and estimates gene networks on two different conditions in order to identify changes on regulations between the conditions. In the mathematical model of our approach, hidden binary variables are newly introduced to indicate the presence of regulations on each condition. The use of the hidden binary variables enables an efficient data usage; data on both conditions are used for commonly existing regulations, while for condition specific regulations corresponding data are only applied. Also, the similarity of networks on two conditions is automatically considered from the design of the potential function for the hidden binary variables. For the estimation of the hidden binary variables, we derive a new variational annealing method that searches the configuration of the binary variables maximizing the marginal likelihood. RESULTS: For the performance evaluation, we use time series data from two topologically similar synthetic networks, and confirm that our proposed approach estimates commonly existing regulations as well as changes on regulations with higher coverage and precision than other existing approaches in almost all the experimental settings. For a real data application, our proposed approach is applied to time series data from normal human lung cells and human lung cells treated by stimulating EGF-receptors and dosing an anticancer drug termed Gefitinib. In the treated lung cells, a cancer cell condition is simulated by the stimulation of EGF-receptors, but the effect would be counteracted due to the selective inhibition of EGF-receptors by Gefitinib. However, gene expression profiles are actually different between the conditions, and the genes related to the identified changes are considered as possible off-targets of Gefitinib. CONCLUSIONS: From the synthetically generated time series data, our proposed approach can identify changes on regulations more accurately than existing methods. By applying the proposed approach to the time series data on normal and treated human lung cells, candidates of off-target genes of Gefitinib are found. According to the published clinical information, one of the genes can be related to a factor of interstitial pneumonia, which is known as a side effect of Gefitinib.

  16. An efficient biological pathway layout algorithm combining grid-layout and spring embedder for complicated cellular location information. (international journal, peer-reviewed)

    Kaname Kojima, Masao Nagasaki, Satoru Miyano

    BMC Bioinformatics 11 335 June 18, 2010

    DOI: 10.1186/1471-2105-11-335  

    BACKGROUND: Graph drawing is one of the important techniques for understanding biological regulations in a cell or among cells at the pathway level. Among many available layout algorithms, the spring embedder algorithm is widely used not only for pathway drawing but also for circuit placement, WWW visualization, and so on because of the harmonized appearance of its results. For pathway drawing, location information is essential for its comprehension. However, complex shapes need to be taken into account when torus-shaped location information such as nuclear inner membrane, nuclear outer membrane, and plasma membrane is considered. Unfortunately, the spring embedder algorithm cannot easily handle such information. In addition, crossings between edges and nodes are usually not considered explicitly. RESULTS: We proposed a new grid-layout algorithm based on the spring embedder algorithm that can handle location information and provide layouts with harmonized appearance. In grid-layout algorithms, the mapping of nodes to grid points that minimizes a cost function is searched. By imposing positional constraints on grid points, location information including complex shapes can be easily considered. Our layout algorithm includes the spring embedder cost as a component of the cost function. We further extend the layout algorithm to enable dynamic update of the positions and sizes of compartments at each step. CONCLUSIONS: The new spring embedder-based grid-layout algorithm and a spring embedder algorithm are applied to three biological pathways: endothelial cell model, Fas-induced apoptosis model, and C. elegans cell fate simulation model. From the positional constraints, all the results of our algorithm satisfy location information, and hence, more comprehensible layouts are obtained as compared to the spring embedder algorithm. From the comparison of the number of crossings, the results of the grid-layout-based algorithm tend to contain more crossings than those of the spring embedder algorithm due to the positional constraints. For a fair comparison, we also apply our proposed method without positional constraints. This comparison shows that these results contain fewer crossings than those of the spring embedder algorithm. We also compared layouts of the proposed algorithm with and without compartment update and verified that the latter can reach better local optima.

  17. Optimal search on clustered structural constraint for learning bayesian network structure (peer-reviewed)

    Kaname Kojima, Eric Perrier, Seiya Imoto, Satoru Miyano

    Journal of Machine Learning Research 11 285-310 February 2010

  18. A state space representation of VAR models with sparse learning for dynamic gene networks.

    Kaname Kojima, Rui Yamaguchi, Seiya Imoto, Mai Yamauchi, Masao Nagasaki, Ryo Yoshida, Teppei Shimamura, Kazuko Ueno, Tomoyuki Higuchi, Noriko Gotoh, Satoru Miyano

    Genome informatics. International Conference on Genome Informatics 22 56-68 January 2010

    ISSN:0919-9454

    We propose a state space representation of the vector autoregressive model and its sparse learning based on L1 regularization to achieve efficient estimation of dynamic gene networks based on time course microarray data. The proposed method can overcome drawbacks of the vector autoregressive model and state space model: the assumption of equal time intervals and the lack of ability to separate observation and system noise in the former method, and the assumption of modularity of network structure in the latter method. However, in a simple implementation the proposed model requires the calculation of large inverse matrices a large number of times during the parameter estimation process based on the EM algorithm. This limits the applicability of the proposed method to a relatively small gene set. We thus introduce a new calculation technique for the EM algorithm that does not require the calculation of inverse matrices. The proposed method is applied to time course microarray data of lung cells treated by stimulating EGF receptors and dosing an anticancer drug, Gefitinib. By comparing the estimated network with the control network estimated using non-treated lung cells, genes perturbed by the anticancer drug could be found, whose up- and down-stream genes in the estimated networks may be related to side effects of the anticancer drug.

  19. Gene regulatory network clustering for graph layout based on microarray gene expression data.

    Kaname Kojima, Seiya Imoto, Masao Nagasaki, Satoru Miyano

    Genome informatics. International Conference on Genome Informatics 24 84-95 2010

    ISSN:0919-9454

    We propose a statistical model realizing simultaneous estimation of gene regulatory network and gene module identification from time series gene expression data from microarray experiments. Under the assumption that genes in the same module are densely connected, the proposed method detects gene modules based on the variational Bayesian technique. The model can also incorporate existing biological prior knowledge such as protein subcellular localization. We applied the proposed model to the time series data from a synthetically generated network and verified the effectiveness of the proposed model. The proposed model is also applied to the time series microarray data from HeLa cells. Detected gene module information is of great help in drawing the estimated gene network.

  20. Fast grid layout algorithm for biological networks with sweep calculation. International journal, Peer-reviewed

    Kaname Kojima, Masao Nagasaki, Satoru Miyano

    Bioinformatics 24 (12) 1433-41 June 15, 2008

    DOI: 10.1093/bioinformatics/btn196  

    MOTIVATION: Properly drawn biological networks are of great help in the comprehension of their characteristics. The quality of the layouts for retrieved biological networks is critical for pathway databases. However, since it is unrealistic to manually draw biological networks for every retrieval, automatic drawing algorithms are essential. Grid layout algorithms handle various biological properties such as aligning vertices having the same attributes and complicated positional constraints according to their subcellular localizations; thus, they succeed in providing biologically comprehensible layouts. However, existing grid layout algorithms are not suitable for real-time drawing, which is one of the requisites for applications to pathway databases, due to their high computational cost. In addition, they do not consider edge directions and their resulting layouts lack traceability for biochemical reactions and gene regulations, which are the most important features in biological networks. RESULTS: We devise a new calculation method termed sweep calculation and reduce the time complexity of the current grid layout algorithms through its encoding and decoding processes. We conduct practical experiments by using 95 pathway models of various sizes from TRANSPATH and show that our new grid layout algorithm is much faster than existing grid layout algorithms. For the cost function, we introduce a new component that penalizes undesirable edge directions to avoid the lack of traceability in pathways due to the differences in direction between in-edges and out-edges of each vertex. AVAILABILITY: Java implementations of our layout algorithms are available in Cell Illustrator. CONTACT: masao@ims.u-tokyo.ac.jp SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

  21. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data.

    Kaname Kojima, André Fujita, Teppei Shimamura, Seiya Imoto, Satoru Miyano

    Genome informatics. International Conference on Genome Informatics 20 37-51 2008

    ISSN:0919-9454

    Recently, a nonlinear vector autoregressive (NVAR) model based on Granger causality was proposed to infer nonlinear gene regulatory networks from time series gene expression data. Since NVAR requires a large number of parameters due to the basis expansion, the length of time series microarray data is insufficient for accurate parameter estimation and we need to limit the size of the gene set strongly. To address this limitation, we employ the L1 regularization technique to estimate NVAR. Under L1 regularization, direct parents of each gene can be selected efficiently even when the number of parameters exceeds the number of data samples. We can thus estimate larger gene regulatory networks more accurately than those from existing methods. Through a simulation study, we verify the effectiveness of the proposed method by comparing its limitation in the number of genes to that of the existing NVAR. The proposed method is also applied to time series microarray data of the human HeLa cell cycle.

  22. An efficient grid layout algorithm for biological networks utilizing various biological attributes. International journal, Peer-reviewed

    Kaname Kojima, Masao Nagasaki, Euna Jeong, Mitsuru Kato, Satoru Miyano

    BMC Bioinformatics 8 76-76 March 6, 2007

    eISSN:1471-2105

    BACKGROUND: Clearly visualized biopathways provide a great help in understanding biological systems. However, manual drawing of large-scale biopathways is time consuming. We proposed a grid layout algorithm that can handle gene-regulatory networks and signal transduction pathways by considering edge-edge crossing, node-edge crossing, distance measure between nodes, and subcellular localization information from Gene Ontology. Consequently, the layout algorithm succeeded in drastically reducing these crossings in the apoptosis model. However, for larger-scale networks, we encountered three problems: (i) the initial layout is often very far from any local optimum because nodes are initially placed at random, (ii) from a biological viewpoint, human layouts still exceed automatic layouts in understanding because, except for subcellular localization, the algorithm does not fully utilize biological information of pathways, and (iii) it employs a local search strategy in which the neighborhood is obtained by moving one node at each step, and automatic layouts suggest that simultaneous movements of multiple nodes are necessary for better layouts, while such an extension may worsen the time complexity. RESULTS: We propose a new grid layout algorithm. To address problem (i), we devised a new force-directed algorithm whose output is suitable as the initial layout. For (ii), we considered that an appropriate alignment of nodes having the same biological attribute is one of the most important factors for comprehension, and we defined a new score function that gives an advantage to such configurations. For solving problem (iii), we developed a search strategy that considers swapping nodes as well as moving a node, while keeping the order of the time complexity. Though a naïve implementation increases the time complexity by one order, we solved this difficulty by devising a method that caches differences between scores of a layout and its possible updates. CONCLUSION: Layouts of the new grid layout algorithm are compared with those of the previous algorithm and human layout in an endothelial cell model, three times as large as the apoptosis model. The total cost of the result from the new grid layout algorithm is similar to that of the human layout. In addition, its convergence time is drastically reduced (40% reduction).

  23. jMorp: Japanese Multi-Omics Reference Panel update report 2023. International journal

    Shu Tadaka, Junko Kawashima, Eiji Hishinuma, Sakae Saito, Yasunobu Okamura, Akihito Otsuki, Kaname Kojima, Shohei Komaki, Yuichi Aoki, Takanari Kanno, Daisuke Saigusa, Jin Inoue, Matsuyuki Shirota, Jun Takayama, Fumiki Katsuoka, Atsushi Shimizu, Gen Tamiya, Ritsuko Shimizu, Masahiro Hiratsuka, Ikuko N Motoike, Seizo Koshiba, Makoto Sasaki, Masayuki Yamamoto, Kengo Kinoshita

    Nucleic acids research November 1, 2023

    DOI: 10.1093/nar/gkad978  

    Modern medicine is increasingly focused on personalized medicine, and multi-omics data is crucial in understanding biological phenomena and disease mechanisms. Each ethnic group has its unique genetic background with specific genomic variations influencing disease risk and drug response. Therefore, multi-omics data from specific ethnic populations are essential for the effective implementation of personalized medicine. Various prospective cohort studies, such as the UK Biobank, All of Us and Lifelines, have been conducted worldwide. The Tohoku Medical Megabank project was initiated after the Great East Japan Earthquake in 2011. It collects biological specimens and conducts genome and omics analyses to build a basis for personalized medicine. Summary statistical data from these analyses are available in the jMorp web database (https://jmorp.megabank.tohoku.ac.jp), which provides a multidimensional approach to the diversity of the Japanese population. jMorp was launched in 2015 as a public database for plasma metabolome and proteome analyses and has been continuously updated. The current update will significantly expand the scale of the data (metabolome, genome, transcriptome, and metagenome). In addition, the user interface and backend server implementations were rewritten to improve the connectivity between the items stored in jMorp. This paper provides an overview of the new version of the jMorp.

  24. Sex-Specific Differences in the Transcriptome of the Human Dorsolateral Prefrontal Cortex in Schizophrenia. International journal

    Zhiqian Yu, Kazuko Ueno, Ryo Funayama, Mai Sakai, Naoki Nariai, Kaname Kojima, Yoshie Kikuchi, Xue Li, Chiaki Ono, Junpei Kanatani, Jiro Ono, Kazuya Iwamoto, Kenji Hashimoto, Kengo Kinoshita, Keiko Nakayama, Masao Nagasaki, Hiroaki Tomita

    Molecular neurobiology November 22, 2022

    DOI: 10.1007/s12035-022-03109-6  

    Schizophrenia presents clinical and biological differences between males and females. This study investigated transcriptional profiles in the dorsolateral prefrontal cortex (DLPFC) using postmortem data from the largest RNA-sequencing (RNA-seq) database on schizophrenic cases and controls. Data for 154 male and 113 female controls and 160 male and 93 female schizophrenic cases were obtained from the CommonMind Consortium. In the RNA-seq database, the principal component analysis showed that sex effects were small in schizophrenia. After we analyzed the impact of sex-specific differences on gene expression, the female group showed more significantly changed genes compared with the male group. Based on the gene ontology analysis, the female sex-specific genes that changed were overrepresented in the mitochondrion, ATP (phosphocreatine and adenosine triphosphate)-, and metal ion-binding relevant biological processes. An ingenuity pathway analysis revealed that the differentially expressed genes related to schizophrenia in the female group were involved in midbrain dopaminergic and γ-aminobutyric acid (GABA)-ergic neurons and microglia. We used methylated DNA-binding domain-sequencing analyses and microarray to investigate the DNA methylation that potentially impacts the sex differences in gene transcription using a maternal immune activation (MIA) murine model. Among the sex-specific positional genes related to schizophrenia in the PFC of female offspring from MIA, the changes in the methylation and transcriptional expression of loci ACSBG1 were validated in the females with schizophrenia in independent postmortem samples by real-time PCR and pyrosequencing. Our results reveal potential genetic risks in the DLPFC for the sex-dependent prevalence and symptomology of schizophrenia.

  25. Genome-wide association analysis of axial length in the Tohoku Medical Megabank Project health survey

    布施 昇男, 櫻井 美由紀, 元池 育子, 小島 要, 平良 摩紀子, 宇留野 晃, 濱中 洋平, 中村 智洋, 荻島 創一, 寳澤 篤, 栗山 進一, 呉 繁夫, 木下 賢吾, 山本 雅之

    日本眼科学会雑誌 126 (Extra issue) 216-216 March 2022

    Publisher: (公財)日本眼科学会

    ISSN:0029-0203

  26. Genome-wide Association Study of Axial Length in Population-based Cohorts in Japan: The Tohoku Medical Megabank Organization Eye Study. International journal

    Nobuo Fuse, Miyuki Sakurai, Ikuko N Motoike, Kaname Kojima, Takako Takai-Igarashi, Naoki Nakaya, Naho Tsuchiya, Tomohiro Nakamura, Mami Ishikuro, Taku Obara, Akiko Miyazawa, Kei Homma, Keisuke Ido, Makiko Taira, Tomoko Kobayashi, Ritsuko Shimizu, Akira Uruno, Eiichi N Kodama, Kichiya Suzuki, Yohei Hamanaka, Hiroaki Tomita, Junichi Sugawara, Yoichi Suzuki, Fuji Nagami, Soichi Ogishima, Fumiki Katsuoka, Naoko Minegishi, Atsushi Hozawa, Shinichi Kuriyama, Nobuo Yaegashi, Shigeo Kure, Kengo Kinoshita, Masayuki Yamamoto

    Ophthalmology science 2 (1) 100113-100113 March 2022

    Publisher: Elsevier BV

    DOI: 10.1016/j.xops.2022.100113  

    ISSN:2666-9145

    PURPOSE: To elucidate the differences in ocular biometric parameters by generation and gender and to identify axial length (AL)-associated genetic variants in Japanese individuals, we analyzed Tohoku Medical Megabank Organization (ToMMo) Eye Study data. DESIGN: We designed the ToMMo Eye Study, examined AL variations, and conducted genome-wide association studies (GWASs). PARTICIPANTS: In total, 33 483 participants aged > 18 years who were recruited into the community-based cohort (CommCohort) and the birth and three-generation cohort (BirThree Cohort) of the ToMMo Eye Study were examined. METHODS: Each participant was screened with an interview, ophthalmic examinations, and a microarray analysis. The GWASs were performed in 22 379 participants in the CommCohort (discovery stage) and 11 104 participants in the BirThree Cohort (replication stage). We evaluated the associations of single nucleotide polymorphisms (SNPs) with AL using a genome-wide significance threshold (5 × 10⁻⁸) in each stage of the study and in the subsequent meta-analysis. MAIN OUTCOME MEASURES: We identified the association of SNPs with AL and distributions of AL in right and left eyes and individuals of different sexes and ages. RESULTS: In the discovery stage, the mean AL of the right eye (23.99 mm) was significantly greater than that of the left eye (23.95 mm). This difference was reproducible across sexes and ages. The GWASs revealed 703 and 215 AL-associated SNPs with genome-wide significance in the discovery and validation stages, respectively, and many of the SNPs in the discovery stage were replicated in the validation stage. Validated SNPs and their associated loci were meta-analyzed for statistical significance (P < 5 × 10⁻⁸). This study identified 1478 SNPs spread over 31 loci. Of the 31 loci, 5 are known AL loci, 15 are known refractive-error loci, 4 are known corneal-curvature loci, and 7 loci are newly identified loci that are not known to be associated with AL. Of note, some of them shared functional relationships with previously identified loci. CONCLUSIONS: Our large-scale GWASs exploiting ToMMo Eye Study data identified 31 loci linked to variations in AL, 7 of which are newly reported in this article. The results revealed genetic heterogeneity and similarity in SNPs related to ethnic variations in AL.

  27. A study of the heritability of serum urate levels with a focus on rare variants

    三澤 計治, 三島 英換, 長谷川 嵩矩, 大内 基司, 小島 要, 河合 洋介, 松尾 雅文, 安西 尚彦, 長﨑 正朗

    痛風と尿酸・核酸 45 (1) 23-30 July 25, 2021

    Publisher: (一社)日本痛風・尿酸核酸学会

    eISSN:2435-0095

  28. rs1944919 on chromosome 11q23.1 and its effector genes COLCA1/COLCA2 confer susceptibility to primary biliary cholangitis. International journal

    Yuki Hitomi, Yoshihiro Aiba, Yosuke Kawai, Kaname Kojima, Kazuko Ueno, Nao Nishida, Minae Kawashima, Olivier Gervais, Seik-Soon Khor, Masao Nagasaki, Katsushi Tokunaga, Minoru Nakamura, Makoto Tsuiji

    Scientific reports 11 (1) 4557-4557 February 25, 2021

    DOI: 10.1038/s41598-021-84042-x  

    Primary biliary cholangitis (PBC) is a chronic, progressive cholestatic liver disease in which intrahepatic bile ducts are destroyed by an autoimmune reaction. Our previous genome-wide association study (GWAS) identified chromosome 11q23.1 as a susceptibility gene locus for PBC in the Japanese population. Here, high-density association mapping based on single nucleotide polymorphism (SNP) imputation and in silico/in vitro functional analyses identified rs1944919 as the primary functional variant. Expression-quantitative trait loci analyses showed that the PBC susceptibility allele of rs1944919 was significantly associated with increased COLCA1/COLCA2 expression levels. Additionally, the effects of rs1944919 on COLCA1/COLCA2 expression levels were confirmed using genotype knock-in versions of cell lines constructed using the CRISPR/Cas9 system and differed between rs1944919-G/G clones and -T/T clones. To our knowledge, this is the first study to demonstrate the contribution of COLCA1/COLCA2 to PBC susceptibility.

  29. A pro-diabetogenic mtDNA polymorphism in the mitochondrial-derived peptide, MOTS-c. International journal

    Hirofumi Zempo, Su-Jeong Kim, Noriyuki Fuku, Yuichiro Nishida, Yasuki Higaki, Junxiang Wan, Kelvin Yen, Brendan Miller, Roberto Vicinanza, Eri Miyamoto-Mikami, Hiroshi Kumagai, Hisashi Naito, Jialin Xiao, Hemal H Mehta, Changhan Lee, Megumi Hara, Yesha M Patel, Veronica W Setiawan, Timothy M Moore, Andrea L Hevener, Yoichi Sutoh, Atsushi Shimizu, Kaname Kojima, Kengo Kinoshita, Yasumichi Arai, Nobuyoshi Hirose, Seiji Maeda, Keitaro Tanaka, Pinchas Cohen

    Aging 13 (2) 1692-1717 January 19, 2021

    DOI: 10.18632/aging.202529  

    Type 2 Diabetes (T2D) is an emerging public health problem in Asia. Although ethnic specific mtDNA polymorphisms have been shown to contribute to T2D risk, the functional effects of the mtDNA polymorphisms and the therapeutic potential of mitochondrial-derived peptides at the mtDNA polymorphisms are underexplored. Here, we showed an Asian-specific mitochondrial DNA variation m.1382A>C (rs111033358) leads to a K14Q amino acid replacement in MOTS-c, an insulin sensitizing mitochondrial-derived peptide. Meta-analysis of three cohorts (n = 27,527, J-MICC, MEC, and TMM) shows that males but not females with the C-allele exhibit a higher prevalence of T2D. In J-MICC, only males with the C-allele in the lowest tertile of physical activity increased their prevalence of T2D, demonstrating a kinesio-genomic interaction. High-fat fed, male mice injected with MOTS-c showed reduced weight and improved glucose tolerance, but not K14Q-MOTS-c treated mice. Like the human data, female mice were unaffected. Mechanistically, K14Q-MOTS-c leads to diminished insulin-sensitization in vitro. Thus, the m.1382A>C polymorphism is associated with susceptibility to T2D in men, possibly interacting with exercise, and contributing to the risk of T2D in sedentary males by reducing the activity of MOTS-c.

  30. Identification of genes that determine IgE levels in Japanese individuals by genome-wide association study

    志藤 光介, 小島 要, 山崎 研志, 木下 賢吾, 相場 節也

    日本皮膚免疫アレルギー学会総会学術大会プログラム・抄録集 50th 188-188 December 2020

    Publisher: (一社)日本皮膚免疫アレルギー学会

  31. Rare variants explain a substantial portion of the missing heritability of serum urate levels

    三澤 計治, 長谷川 嵩矩, 三島 英換, Jutabha Promsuk, 大内 基司, 小島 要, 河合 洋介, 長崎 正朗, 安西 尚彦

    痛風と尿酸・核酸 44 (1) 86-86 July 2020

    Publisher: (一社)日本痛風・尿酸核酸学会

    eISSN:2435-0095

  32. Rare variants explain a substantial portion of the missing heritability of serum urate levels

    三澤 計治, 長谷川 嵩矩, 三島 英換, Jutabha Promsuk, 大内 基司, 小島 要, 河合 洋介, 長崎 正朗, 安西 尚彦

    痛風と尿酸・核酸 44 (1) 86-86 July 2020

    Publisher: (一社)日本痛風・尿酸核酸学会

    eISSN:2435-0095

  33. Integrated GWAS and mRNA Microarray Analysis Identified IFNG and CD40L as the Central Upstream Regulators in Primary Biliary Cholangitis. International journal

    Kazuko Ueno, Yoshihiro Aiba, Yuki Hitomi, Shinji Shimoda, Hitomi Nakamura, Olivier Gervais, Yosuke Kawai, Minae Kawashima, Nao Nishida, Seik-Soon Kohn, Kaname Kojima, Shinji Katsushima, Atsushi Naganuma, Kazuhiro Sugi, Tatsuji Komatsu, Tomohiko Mannami, Kouki Matsushita, Kaname Yoshizawa, Fujio Makita, Toshiki Nikami, Hideo Nishimura, Hiroshi Kouno, Hirotaka Kouno, Hajime Ohta, Takuya Komura, Satoru Tsuruta, Kazuhiko Yamauchi, Tatsuro Kobata, Amane Kitasato, Tamotsu Kuroki, Seigo Abiru, Shinya Nagaoka, Atsumasa Komori, Hiroshi Yatsuhashi, Kiyoshi Migita, Hiromasa Ohira, Atsushi Tanaka, Hajime Takikawa, Masao Nagasaki, Katsushi Tokunaga, Minoru Nakamura

    Hepatology communications 4 (5) 724-738 May 2020

    DOI: 10.1002/hep4.1497  

    Genome-wide association studies (GWASs) in European and East Asian populations have identified more than 40 disease-susceptibility genes in primary biliary cholangitis (PBC). The aim of this study is to computationally identify disease pathways, upstream regulators, and therapeutic targets in PBC through integrated GWAS and messenger RNA (mRNA) microarray analysis. Disease pathways and upstream regulators were analyzed with ingenuity pathway analysis in data set 1 for GWASs (1,920 patients with PBC and 1,770 controls), which included 261 annotated genes derived from 6,760 single-nucleotide polymorphisms (P < 0.00001), and data set 2 for mRNA microarray analysis of liver biopsy specimens (36 patients with PBC and 5 normal controls), which included 1,574 genes with fold change >2 versus controls (P < 0.05). Hierarchical cluster analysis and categorization of cell type-specific genes were performed for data set 2. There were 27 genes, 10 pathways, and 149 upstream regulators that overlapped between data sets 1 and 2. All 10 pathways were immune-related. The most significant common upstream regulators associated with PBC disease susceptibility identified were interferon-gamma (IFNG) and CD40 ligand (CD40L). Hierarchical cluster analysis of data set 2 revealed two distinct groups of patients with PBC by disease activity. The most significant upstream regulators associated with disease activity were IFNG and CD40L. Several molecules expressed in B cells, T cells, Kupffer cells, and natural killer-like cells were identified as potential therapeutic targets in PBC with reference to a recently reported list of cell type-specific gene expression in the liver. Conclusion: Our integrated analysis using GWAS and mRNA microarray data sets predicted that IFNG and CD40L are the central upstream regulators in both disease susceptibility and activity of PBC and identified potential downstream therapeutic targets.

  34. Contribution of Rare Variants of the SLC22A12 Gene to the Missing Heritability of Serum Urate Levels. International journal

    Kazuharu Misawa, Takanori Hasegawa, Eikan Mishima, Promsuk Jutabha, Motoshi Ouchi, Kaname Kojima, Yosuke Kawai, Masafumi Matsuo, Naohiko Anzai, Masao Nagasaki

    Genetics 214 (4) 1079-1090 April 2020

    DOI: 10.1534/genetics.119.303006  

    Gout is a common arthritis caused by monosodium urate crystals. The heritability of serum urate levels is estimated to be 30-70%; however, common genetic variants account for only 7.9% of the variance in serum urate levels. This discrepancy is an example of "missing heritability." The "missing heritability" suggests that variants associated with uric acid levels are yet to be found. By using genomic sequences of the ToMMo cohort, we identified rare variants of the SLC22A12 gene that affect the urate transport activity of URAT1. URAT1 is a transporter protein encoded by the SLC22A12 gene. We grouped the participants with variants affecting urate uptake by URAT1 and analyzed the variance of serum urate levels. The results showed that the heritability explained by the SLC22A12 variants of men and women exceeds 10%, suggesting that rare variants underlie a substantial portion of the "missing heritability" of serum urate levels.

  35. Rare variants explain a substantial portion of the missing heritability of serum urate levels

    三澤 計治, 長谷川 嵩矩, 三島 英換, Jutabha Promsuk, 大内 基司, 小島 要, 河合 洋介, 長崎 正朗, 安西 尚彦

    日本痛風・核酸代謝学会総会プログラム抄録集 53rd 66-66 January 2020

    Publisher: (一社)日本痛風・尿酸核酸学会

  36. Genome-wide association meta-analysis and Mendelian randomization analysis confirm the influence of ALDH2 on sleep duration in the Japanese population. International journal

    Takeshi Nishiyama, Masahiro Nakatochi, Atsushi Goto, Motoki Iwasaki, Tsuyoshi Hachiya, Yoichi Sutoh, Atsushi Shimizu, Chaochen Wang, Hideo Tanaka, Miki Watanabe, Akihiro Hosono, Yuya Tamai, Tamaki Yamada, Taiki Yamaji, Norie Sawada, Kentaro Fukumoto, Kotaro Otsuka, Kozo Tanno, Hiroaki Tomita, Kaname Kojima, Masao Nagasaki, Atsushi Hozawa, Asahi Hishida, Tae Sasakabe, Yuichiro Nishida, Megumi Hara, Hidemi Ito, Isao Oze, Yohko Nakamura, Haruo Mikami, Rie Ibusuki, Toshiro Takezaki, Teruhide Koyama, Nagato Kuriyama, Kaori Endoh, Kiyonori Kuriki, Tanvir C Turin, Takashima Naoyuki, Sakurako Katsuura-Kamano, Hirokazu Uemura, Rieko Okada, Sayo Kawai, Mariko Naito, Yukihide Momozawa, Michiaki Kubo, Makoto Sasaki, Masayuki Yamamoto, Shoichiro Tsugane, Kenji Wakai, Sadao Suzuki

    Sleep 42 (6) June 11, 2019

    DOI: 10.1093/sleep/zsz046  

    Usual sleep duration has substantial heritability and is associated with various physical and psychiatric conditions as well as mortality. However, for its genetic locus, only PAX8 and VRK2 have been replicated in previous genome-wide association studies (GWAS). We conducted a GWAS meta-analysis of self-reported usual sleep duration using three population-based cohorts totaling 31 230 Japanese individuals. A genome-wide significant locus was identified at 12q24 (p-value < 5.0 × 10⁻⁸). Subsequently, a functional variant in the ALDH2 locus, rs671, was replicated in an independent sample of 5140 Japanese individuals (p-value = 0.004). The association signal, however, disappeared after adjusting for alcohol consumption, indicating the possibility that the rs671 genotype modifies sleep duration via alcohol consumption. This hypothesis explained a modest genetic correlation observed between sleep duration and alcohol consumption (rG = 0.23). A Mendelian randomization analysis using rs671 and other variants as instrumental variables confirmed this by showing a causal effect of alcohol consumption, but not of coffee consumption, on sleep duration. Another genome-wide significant locus was identified at 5q33 after adjusting for drinking frequency. However, this locus was not replicated, nor were the PAX8 and VRK2 loci. Our study has confirmed that a functional ALDH2 variant, rs671, most strongly influences usual sleep duration, possibly via alcohol consumption, in the Japanese population, and presumably in East Asian populations. This highlights the importance of considering the involvement of alcohol consumption in future GWAS of usual sleep duration, even in non-East Asian populations, where rs671 is monomorphic.

  37. Estimating carrier frequencies of newborn screening disorders using a whole-genome reference panel of 3552 Japanese individuals. International journal

    Yumi Yamaguchi-Kabata, Jun Yasuda, Akira Uruno, Kazuro Shimokawa, Seizo Koshiba, Yoichi Suzuki, Nobuo Fuse, Hiroshi Kawame, Shu Tadaka, Masao Nagasaki, Kaname Kojima, Fumiki Katsuoka, Kazuki Kumada, Osamu Tanabe, Gen Tamiya, Nobuo Yaegashi, Kengo Kinoshita, Masayuki Yamamoto, Shigeo Kure

    Human genetics 138 (4) 389-409 April 2019

    DOI: 10.1007/s00439-019-01998-7  

    Incidence rates of Mendelian diseases vary among ethnic groups, and frequencies of variant types of causative genes also vary among human populations. In this study, we examined to what extent we can predict population frequencies of recessive disorders from genomic data, and explored better strategies for variant interpretation and classification. We used a whole-genome reference panel from 3552 general Japanese individuals constructed by the Tohoku Medical Megabank Organization (ToMMo). Focusing on 32 genes for 17 congenital metabolic disorders included in newborn screening (NBS) in Japan, we identified reported and predicted pathogenic variants through variant annotation, interpretation, and multiple ways of classifications. The estimated carrier frequencies were compared with those from the Japanese NBS data based on 1,949,987 newborns from a previous study. The estimated carrier frequency based on genomic data with a recent guideline of variant interpretation for the PAH gene, in which defects cause hyperphenylalaninemia (HPA) and phenylketonuria (PKU), provided a closer estimate to that by the observed incidence than the other methods. In contrast, the estimated carrier frequencies for SLC25A13, which causes citrin deficiency, were much higher compared with the incidence rate. The results varied greatly among the 11 NBS diseases with single responsible genes; the possible reasons for departures from the carrier frequencies by reported incidence rates were discussed. Of note, (1) the number of pathogenic variants increases by including additional lines of evidence, (2) common variants with mild effects also contribute to the actual frequency of patients, and (3) penetrance of each variant remains unclear.

  38. Genome analyses for the Tohoku Medical Megabank Project towards establishment of personalized healthcare. International journal

    Jun Yasuda, Kengo Kinoshita, Fumiki Katsuoka, Inaho Danjoh, Mika Sakurai-Yageta, Ikuko N Motoike, Yoko Kuroki, Sakae Saito, Kaname Kojima, Matsuyuki Shirota, Daisuke Saigusa, Akihito Otsuki, Junko Kawashima, Yumi Yamaguchi-Kabata, Shu Tadaka, Yuichi Aoki, Takahiro Mimori, Kazuki Kumada, Jin Inoue, Satoshi Makino, Miho Kuriki, Nobuo Fuse, Seizo Koshiba, Osamu Tanabe, Masao Nagasaki, Gen Tamiya, Ritsuko Shimizu, Takako Takai-Igarashi, Soichi Ogishima, Atsushi Hozawa, Shinichi Kuriyama, Junichi Sugawara, Akito Tsuboi, Hideyasu Kiyomoto, Tadashi Ishii, Hiroaki Tomita, Naoko Minegishi, Yoichi Suzuki, Kichiya Suzuki, Hiroshi Kawame, Hiroshi Tanaka, Yasuyuki Taki, Nobuo Yaegashi, Shigeo Kure, Fuji Nagami, Kenjiro Kosaki, Yoichi Sutoh, Tsuyoshi Hachiya, Atsushi Shimizu, Makoto Sasaki, Masayuki Yamamoto

    Journal of biochemistry 165 (2) 139-158 February 1, 2019

    DOI: 10.1093/jb/mvy096  

    Personalized healthcare (PHC) based on an individual's genetic make-up is one of the most advanced, yet feasible, forms of medical care. The Tohoku Medical Megabank (TMM) Project aims to combine population genomics, medical genetics and prospective cohort studies to develop a critical infrastructure for the establishment of PHC. To date, a TMM CommCohort (adult general population) and a TMM BirThree Cohort (birth+three-generation families) have conducted recruitments and baseline surveys. Genome analyses as part of the TMM Project will aid in the development of a high-fidelity whole-genome Japanese reference panel, in designing custom single-nucleotide polymorphism (SNP) arrays specific to Japanese, and in estimation of the biological significance of genetic variations through linked investigations of the cohorts. Whole-genome sequencing from >3,500 unrelated Japanese and establishment of a Japanese reference genome sequence from long-read data have been done. We next aim to obtain genotype data for all TMM cohort participants (>150,000) using our custom SNP arrays. These data will help identify disease-associated genomic signatures in the Japanese population, while genomic data from TMM BirThree Cohort participants will be used to improve the reference genome panel. Follow-up of the cohort participants will allow us to test the genetic markers and, consequently, contribute to the realization of PHC.

  39. POGLUT1, the putative effector gene driven by rs2293370 in primary biliary cholangitis susceptibility locus chromosome 3q13.33. International journal

    Yuki Hitomi, Kazuko Ueno, Yosuke Kawai, Nao Nishida, Kaname Kojima, Minae Kawashima, Yoshihiro Aiba, Hitomi Nakamura, Hiroshi Kouno, Hirotaka Kouno, Hajime Ohta, Kazuhiro Sugi, Toshiki Nikami, Tsutomu Yamashita, Shinji Katsushima, Toshiki Komeda, Keisuke Ario, Atsushi Naganuma, Masaaki Shimada, Noboru Hirashima, Kaname Yoshizawa, Fujio Makita, Kiyoshi Furuta, Masahiro Kikuchi, Noriaki Naeshiro, Hironao Takahashi, Yutaka Mano, Haruhiro Yamashita, Kouki Matsushita, Seiji Tsunematsu, Iwao Yabuuchi, Hideo Nishimura, Yusuke Shimada, Kazuhiko Yamauchi, Tatsuji Komatsu, Rie Sugimoto, Hironori Sakai, Eiji Mita, Masaharu Koda, Yoko Nakamura, Hiroshi Kamitsukasa, Takeaki Sato, Makoto Nakamuta, Naohiko Masaki, Hajime Takikawa, Atsushi Tanaka, Hiromasa Ohira, Mikio Zeniya, Masanori Abe, Shuichi Kaneko, Masao Honda, Kuniaki Arai, Teruko Arinaga-Hino, Etsuko Hashimoto, Makiko Taniai, Takeji Umemura, Satoru Joshita, Kazuhiko Nakao, Tatsuki Ichikawa, Hidetaka Shibata, Akinobu Takaki, Satoshi Yamagiwa, Masataka Seike, Shotaro Sakisaka, Yasuaki Takeyama, Masaru Harada, Michio Senju, Osamu Yokosuka, Tatsuo Kanda, Yoshiyuki Ueno, Hirotoshi Ebinuma, Takashi Himoto, Kazumoto Murata, Shinji Shimoda, Shinya Nagaoka, Seigo Abiru, Atsumasa Komori, Kiyoshi Migita, Masahiro Ito, Hiroshi Yatsuhashi, Yoshihiko Maehara, Shinji Uemoto, Norihiro Kokudo, Masao Nagasaki, Katsushi Tokunaga, Minoru Nakamura

    Scientific reports 9 (1) 102-102 January 14, 2019

    DOI: 10.1038/s41598-018-36490-1  

    Primary biliary cholangitis (PBC) is a chronic and cholestatic autoimmune liver disease caused by the destruction of intrahepatic small bile ducts. Our previous genome-wide association study (GWAS) identified six susceptibility loci for PBC. Here, in order to further elucidate the genetic architecture of PBC, a GWAS was performed on an additional independent sample set, then a genome-wide meta-analysis with our previous GWAS was performed based on a whole-genome single nucleotide polymorphism (SNP) imputation analysis of a total of 4,045 Japanese individuals (2,060 cases and 1,985 healthy controls). A susceptibility locus on chromosome 3q13.33 (including ARHGAP31, TMEM39A, POGLUT1, TIMMDC1, and CD80) was previously identified both in the European and Chinese populations and was replicated in the Japanese population (OR = 0.7241, P = 3.5 × 10⁻⁹). Subsequent in silico and in vitro functional analyses identified rs2293370, previously reported as the top-hit SNP in this locus in the European population, as the primary functional SNP. Moreover, e-QTL analysis indicated that the effector gene of rs2293370 was Protein O-Glucosyltransferase 1 (POGLUT1) (P = 3.4 × 10⁻⁸). This is the first study to demonstrate that POGLUT1 and not CD80 is the effector gene regulated by the primary functional SNP rs2293370, and that increased expression of POGLUT1 might be involved in the pathogenesis of PBC.

  40. NFKB1 and MANBA Confer Disease Susceptibility to Primary Biliary Cholangitis via Independent Putative Primary Functional Variants. [International journal]

    Yuki Hitomi, Ken Nakatani, Kaname Kojima, Nao Nishida, Yosuke Kawai, Minae Kawashima, Yoshihiro Aiba, Masao Nagasaki, Minoru Nakamura, Katsushi Tokunaga

    Cellular and molecular gastroenterology and hepatology 7 (3) 515-532 2019

    DOI: 10.1016/j.jcmgh.2018.11.006  

    BACKGROUND & AIMS: Primary biliary cholangitis (PBC) is a chronic and cholestatic liver disease that eventually leads to cirrhosis and hepatic failure. We recently identified several susceptibility genes for PBC, including NFKB1 and MANBA, in the Japanese population by genome-wide association study. However, the primary functional variants in the NFKB1/MANBA region and the molecular mechanism for conferring disease susceptibility to PBC have not yet been clarified. METHODS: We performed high-density association mapping based on a single-nucleotide polymorphism (SNP) imputation analysis, using data from a whole-genome sequence reference panel of 1070 Japanese individuals and the previous genome-wide association study (1389 PBC patients, 1508 healthy controls). Among SNPs (P < 5.0 × 10⁻⁷) in the NFKB1/MANBA region, putative primary functional variants and the molecular mechanism for conferring disease susceptibility to PBC were identified by in silico/in vitro functional analysis. RESULTS: Among the SNPs in the NFKB1/MANBA region, rs17032850 and rs227361, which changed the binding of transcription factors lymphoid enhancer-binding factor 1 (LEF-1) and retinoid X receptor α (RXRα), respectively, were identified as putative primary functional variants that regulate gene expression. In addition, expression-quantitative trait locus data and gene editing using a clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9 system supported the potential role of rs17032850 and rs227361 in regulating NFKB1 and MANBA expression, respectively. CONCLUSIONS: We identified independent putative primary functional variants in NFKB1/MANBA and showed the distinct molecular mechanism by which each putative primary functional variant conferred susceptibility to PBC. Our approach was useful to dissect the pathogenesis not only of PBC, but also other digestive diseases in which NFKB1/MANBA has been reported as a susceptibility locus.

  41. Construction of JRG (Japanese reference genome) with single-molecule real-time sequencing. [International journal]

    Masao Nagasaki, Yoko Kuroki, Tomoko F Shibata, Fumiki Katsuoka, Takahiro Mimori, Yosuke Kawai, Naoko Minegishi, Atsushi Hozawa, Shinichi Kuriyama, Yoichi Suzuki, Hiroshi Kawame, Fuji Nagami, Takako Takai-Igarashi, Soichi Ogishima, Kaname Kojima, Kazuharu Misawa, Osamu Tanabe, Nobuo Fuse, Hiroshi Tanaka, Nobuo Yaegashi, Kengo Kinoshita, Shigeo Kure, Jun Yasuda, Masayuki Yamamoto

    Human genome variation 6 27-27 2019

    DOI: 10.1038/s41439-019-0057-7  

    In recent genome analyses, population-specific reference panels have proven important. However, reference panels based on short-read sequencing data do not sufficiently cover long insertions. Therefore, the nature of long insertions has not been well documented. Here, we assembled a Japanese genome using single-molecule real-time sequencing data and characterized insertions found in the assembled genome. We identified 3691 insertions ranging from 100 bps to ~10,000 bps in the assembled genome relative to the international reference sequence (GRCh38). To validate and characterize these insertions, we mapped short-reads from 1070 Japanese individuals and 728 individuals from eight other populations to insertions integrated into GRCh38. With this result, we constructed JRGv1 (Japanese Reference Genome version 1) by integrating the 903 verified insertions, totaling 1,086,173 bases, shared by at least two Japanese individuals into GRCh38. We also constructed decoyJRGv1 by concatenating 3559 verified insertions, totaling 2,536,870 bases, shared by at least two Japanese individuals or by six other assemblies. This assembly improved the alignment ratio by 0.4% on average. These results demonstrate the importance of refining the reference assembly and creating a population-specific reference genome. JRGv1 and decoyJRGv1 are available at the JRG website.

  42. 3.5KJPNv2: an allele frequency panel of 3552 Japanese individuals including the X chromosome. [International journal]

    Shu Tadaka, Fumiki Katsuoka, Masao Ueki, Kaname Kojima, Satoshi Makino, Sakae Saito, Akihito Otsuki, Chinatsu Gocho, Mika Sakurai-Yageta, Inaho Danjoh, Ikuko N Motoike, Yumi Yamaguchi-Kabata, Matsuyuki Shirota, Seizo Koshiba, Masao Nagasaki, Naoko Minegishi, Atsushi Hozawa, Shinichi Kuriyama, Atsushi Shimizu, Jun Yasuda, Nobuo Fuse, Gen Tamiya, Masayuki Yamamoto, Kengo Kinoshita

    Human genome variation 6 28-28 2019

    DOI: 10.1038/s41439-019-0059-5  

    The first step towards realizing personalized healthcare is to catalog the genetic variations in a population. Since the dissemination of individual-level genomic information is strictly controlled, it will be useful to construct population-level allele frequency panels with easy-to-use interfaces. In the Tohoku Medical Megabank Project, we sequenced nearly 4000 individuals from a Japanese population and constructed an allele frequency panel of 3552 individuals after removing related samples. The panel is called the 3.5KJPNv2. It was constructed by using a standard pipeline including the 1KGP and gnomAD algorithms to reduce technical biases and to allow comparisons to other populations. Our database is the first large-scale panel providing the frequencies of variants present on the X chromosome and on the mitochondria in the Japanese population. All the data are available on our original database at https://jmorp.megabank.tohoku.ac.jp.

  43. Time-series filtering for replicated observations via a kernel approximate Bayesian computation [Peer-reviewed]

    Takanori Hasegawa, Kaname Kojima, Yosuke Kawai, Masao Nagasaki

    IEEE Transactions on Signal Processing 66 (23) 6148-6161 December 1, 2018

    Publisher: Institute of Electrical and Electronics Engineers (IEEE)

    DOI: 10.1109/tsp.2018.2872864  

    ISSN:1053-587X

    eISSN:1941-0476

  44. Strong Association of the HLA-DR/DQ Locus with Childhood Steroid-Sensitive Nephrotic Syndrome in the Japanese Population. [International journal]

    Xiaoyuan Jia, Tomoko Horinouchi, Yuki Hitomi, Akemi Shono, Seik-Soon Khor, Yosuke Omae, Kaname Kojima, Yosuke Kawai, Masao Nagasaki, Yoshitsugu Kaku, Takayuki Okamoto, Yoko Ohwada, Kazuhide Ohta, Yusuke Okuda, Rika Fujimaru, Ken Hatae, Naonori Kumagai, Emi Sawanobori, Hitoshi Nakazato, Yasufumi Ohtsuka, Koichi Nakanishi, Yuko Shima, Ryojiro Tanaka, Akira Ashida, Koichi Kamei, Kenji Ishikura, Kandai Nozu, Katsushi Tokunaga, Kazumoto Iijima

    Journal of the American Society of Nephrology : JASN 29 (8) 2189-2199 August 2018

    DOI: 10.1681/ASN.2017080859  

    Background: Nephrotic syndrome is the most common cause of chronic glomerular disease in children. Most of these patients develop steroid-sensitive nephrotic syndrome (SSNS), but the loci conferring susceptibility to childhood SSNS are mainly unknown. Methods: We conducted a genome-wide association study (GWAS) in the Japanese population; 224 patients with childhood SSNS and 419 adult healthy controls were genotyped using the Affymetrix Japonica Array in the discovery stage. Imputation for six HLA genes (HLA-A, -C, -B, -DRB1, -DQB1, and -DPB1) was conducted on the basis of Japanese-specific references. We performed genotyping for HLA-DRB1/-DQB1 using a sequence-specific oligonucleotide-probing method on a Luminex platform. Whole-genome imputation was conducted using a phased reference panel of 2049 healthy Japanese individuals. Replication was performed in an independent Japanese sample set including 216 patients and 719 healthy controls. We genotyped candidate single-nucleotide polymorphisms using the DigiTag2 assay. Results: The most significant association was detected in the HLA-DR/DQ region and replicated (rs4642516 [minor allele G], combined P_allelic = 7.84×10⁻²³; odds ratio [OR], 0.33; 95% confidence interval [95% CI], 0.26 to 0.41; rs3134996 [minor allele A], combined P_allelic = 1.72×10⁻²⁵; OR, 0.29; 95% CI, 0.23 to 0.37). HLA-DRB1*08:02 (Pc = 1.82×10⁻⁹; OR, 2.62; 95% CI, 1.94 to 3.54) and HLA-DQB1*06:04 (Pc = 2.09×10⁻¹²; OR, 0.10; 95% CI, 0.05 to 0.21) were considered primary HLA alleles associated with childhood SSNS. HLA-DRB1*08:02-DQB1*03:02 (Pc = 7.01×10⁻¹¹; OR, 3.60; 95% CI, 2.46 to 5.29) was identified as the most significant genetic susceptibility factor. Conclusions: The most significant association with childhood SSNS was detected in the HLA-DR/DQ region. Further HLA allele/haplotype analyses should enhance our understanding of molecular mechanisms underlying SSNS.

  45. Regional genetic differences among Japanese populations and performance of genotype imputation using whole-genome reference panel of the Tohoku Medical Megabank Project. [International journal]

    Jun Yasuda, Fumiki Katsuoka, Inaho Danjoh, Yosuke Kawai, Kaname Kojima, Masao Nagasaki, Sakae Saito, Yumi Yamaguchi-Kabata, Shu Tadaka, Ikuko N Motoike, Kazuki Kumada, Mika Sakurai-Yageta, Osamu Tanabe, Nobuo Fuse, Gen Tamiya, Koichiro Higasa, Fumihiko Matsuda, Nobufumi Yasuda, Motoki Iwasaki, Makoto Sasaki, Atsushi Shimizu, Kengo Kinoshita, Masayuki Yamamoto

    BMC genomics 19 (1) 551-551 July 24, 2018

    DOI: 10.1186/s12864-018-4942-0  

    BACKGROUND: Genotype imputation from single-nucleotide polymorphism (SNP) genotype data using a haplotype reference panel consisting of thousands of unrelated individuals from populations of interest can help to identify strongly associated variants in genome-wide association studies. The Tohoku Medical Megabank (TMM) project was established to support the development of precision medicine, together with the whole-genome sequencing of 1070 human genomes from individuals in the Miyagi region (Northeast Japan) and the construction of the 1070 Japanese genome reference panel (1KJPN). Here, we investigated the performance of 1KJPN for genotype imputation of Japanese samples not included in the TMM project and compared it with other population reference panels. RESULTS: We found that the 1KJPN population was more similar to other Japanese populations, Nagahama (south-central Japan) and Aki (Shikoku Island), than to East Asian populations in the 1000 Genomes Project other than JPT, suggesting that the large-scale collection (more than 1000) of Japanese genomes from the Miyagi region covered many of the genetic variations of Japanese in mainland Japan. Moreover, 1KJPN outperformed the phase 3 reference panel of the 1000 Genomes Project (1KGPp3) for Japanese samples, and 1KJPN showed similar imputation rates for the TMM and other Japanese samples for SNPs with minor allele frequencies (MAFs) higher than 1%. CONCLUSIONS: 1KJPN covered most of the variants found in the samples from areas of the Japanese mainland outside the Miyagi region, implying 1KJPN is representative of the Japanese population's genomes. 1KJPN and successive reference panels are useful genome reference panels for the mainland Japanese population. Importantly, the addition of whole genome sequences not included in the 1KJPN panel improved imputation efficiencies for SNPs with MAFs under 1% for samples from most regions of the Japanese archipelago.

  46. NELFCD and CTSZ loci are associated with jaundice-stage progression in primary biliary cholangitis in the Japanese population. [International journal]

    Nao Nishida, Yoshihiro Aiba, Yuki Hitomi, Minae Kawashima, Kaname Kojima, Yosuke Kawai, Kazuko Ueno, Hitomi Nakamura, Noriyo Yamashiki, Tomohiro Tanaka, Sumito Tamura, Akira Mori, Shintaro Yagi, Yuji Soejima, Tomoharu Yoshizumi, Mitsuhisa Takatsuki, Atsushi Tanaka, Kenichi Harada, Shinji Shimoda, Atsumasa Komori, Susumu Eguchi, Yoshihiko Maehara, Shinji Uemoto, Norihiro Kokudo, Masao Nagasaki, Katsushi Tokunaga, Minoru Nakamura

    Scientific reports 8 (1) 8071-8071 May 23, 2018

    DOI: 10.1038/s41598-018-26369-6  

    Approximately 10-20% of patients with primary biliary cholangitis (PBC) progress to jaundice stage regardless of treatment with ursodeoxycholic acid and bezafibrate. In this study, we performed a GWAS and a replication study to identify genetic variants associated with jaundice-stage progression in PBC using a total of 1,375 patients (1,202 early-stage and 173 jaundice-stage) in a Japanese population. SNP rs13720, which is located in the 3'UTR of cathepsin Z (CTSZ), showed the strongest association (odds ratio [OR] = 2.15, P = 7.62 × 10⁻⁷) with progression to jaundice stage in GWAS. High-density association mapping at the CTSZ and negative elongation factor complex member C/D (NELFCD) loci, which are located within a strong linkage disequilibrium (LD) block, revealed that an intronic SNP of CTSZ, rs163800, was significantly associated with jaundice-stage progression (OR = 2.16, P = 8.57 × 10⁻⁸). In addition, eQTL analysis and in silico functional analysis indicated that genotypes of rs163800 or variants in strong LD with rs163800 influence expression levels of both NELFCD and CTSZ mRNA. The present novel findings will help to dissect the mechanism of PBC progression and to facilitate the development of therapies for PBC patients who are resistant to current therapies.

  47. Population-scale whole genome sequencing identifies 271 highly polymorphic short tandem repeats from Japanese population. [International journal]

    Satoshi Hirata, Kaname Kojima, Kazuharu Misawa, Olivier Gervais, Yosuke Kawai, Masao Nagasaki

    Heliyon 4 (5) e00625 May 2018

    DOI: 10.1016/j.heliyon.2018.e00625  

    Forensic DNA typing is widely used to identify missing persons and plays a central role in forensic profiling. DNA typing usually uses capillary electrophoresis fragment analysis of PCR amplification products to detect the length of short tandem repeat (STR) markers. Here, we analyzed whole genome data from 1,070 Japanese individuals generated using massively parallel short-read sequencing of 162 paired-end bases. We have analyzed 843,473 STR loci with two to six basepair repeat units and cataloged highly polymorphic STR loci in the Japanese population. To evaluate the performance of the cataloged STR loci, we compared 23 STR loci, widely used in forensic DNA typing, with capillary electrophoresis based STR genotyping results in the Japanese population. Seventeen loci had high correlations and high call rates. The other six loci had low call rates or low correlations due to either the limitations of short-read sequencing technology, the bioinformatics tool used, or the complexity of repeat patterns. With these analyses, we have also purified the suitable 218 STR loci with four basepair repeat units and 53 loci with five basepair repeat units both for short read sequencing and PCR based technologies, which would be candidates to the actual forensic DNA typing in Japanese population.

  48. Identification of somatic genetic alterations in ovarian clear cell carcinoma with next generation sequencing. [International journal]

    Yusuke Shibuya, Hideki Tokunaga, Sakae Saito, Kazurou Shimokawa, Fumiki Katsuoka, Li Bin, Kaname Kojima, Masao Nagasaki, Masayuki Yamamoto, Nobuo Yaegashi, Jun Yasuda

    Genes, chromosomes & cancer 57 (2) 51-60 February 2018

    DOI: 10.1002/gcc.22507  

    Ovarian clear cell carcinoma (OCCC) is the most refractory subtype of ovarian cancer and more prevalent in Japanese than Caucasians (25% and 5% of all ovarian cancer, respectively). The aim of this study is to discover the genomic alterations that may cause OCCC and effective molecular targets for chemotherapy. Paired genomic DNAs of 48 OCCC tissues and corresponding noncancerous tissues were extracted from formalin-fixed, paraffin embedded specimens collected between 2007 and 2015 at Tohoku University Hospital. All specimens underwent exome sequencing and the somatic genetic alterations were identified. We divided the cases into three clusters based on the mutation spectra. Clinical characteristics such as age of onset and endometriosis are similar among the clusters but one cluster shows mutations related to APOBEC activation, indicating its contribution to subset of OCCC cases. There are three hypermutated cases (showing 12-fold or higher somatic mutations than the other 45 cases) and they have germline and somatic mismatch repair gene alterations. The frequently mutated genes are ARID1A (66.7%), PIK3CA (50%), PPP2R1A (18.8%), and KRAS (16.7%). Somatic mutations important for selection of chemotherapeutic agents, such as BRAF, ERBB2, PDGFRB, PGR, and KRAS are found in 27.1% of OCCC cases, indicating clinical importance of exome analysis for OCCC. Our study suggests that the genetic instability caused by either mismatch repair defect or activation of APOBEC play critical roles in OCCC carcinogenesis.

  49. Evaluation of reported pathogenic variants and their frequencies in a Japanese population based on a whole-genome reference panel of 2049 individuals. [International journal]

    Yumi Yamaguchi-Kabata, Jun Yasuda, Osamu Tanabe, Yoichi Suzuki, Hiroshi Kawame, Nobuo Fuse, Masao Nagasaki, Yosuke Kawai, Kaname Kojima, Fumiki Katsuoka, Sakae Saito, Inaho Danjoh, Ikuko N Motoike, Riu Yamashita, Seizo Koshiba, Daisuke Saigusa, Gen Tamiya, Shigeo Kure, Nobuo Yaegashi, Yoshio Kawaguchi, Fuji Nagami, Shinichi Kuriyama, Junichi Sugawara, Naoko Minegishi, Atsushi Hozawa, Soichi Ogishima, Hideyasu Kiyomoto, Takako Takai-Igarashi, Kengo Kinoshita, Masayuki Yamamoto

    Journal of human genetics 63 (2) 213-230 February 2018

    DOI: 10.1038/s10038-017-0347-1  

    Clarifying allele frequencies of disease-related genetic variants in a population is important in genomic medicine; however, such data is not yet available for the Japanese population. To estimate frequencies of actionable pathogenic variants in the Japanese population, we examined the reported pathological variants in genes recommended by the American College of Medical Genetics and Genomics (ACMG) in our reference panel of genomic variations, 2KJPN, which was created by whole-genome sequencing of 2049 individuals of the resident cohort of the Tohoku Medical Megabank Project. We searched for pathogenic variants in 2KJPN for 57 autosomal ACMG-recommended genes responsible for 26 diseases and then examined their frequencies. By referring to public databases of pathogenic variations, we identified 143 reported pathogenic variants in 2KJPN for the 57 ACMG recommended genes based on a classification system. At the individual level, 21% of the individuals were found to have at least one reported pathogenic allele. We then conducted a literature survey to review the variants and to check for evidence of pathogenicity. Our results suggest that a substantial number of people have reported pathogenic alleles for the ACMG genes, and reviewing variants is indispensable for constructing the information infrastructure of genomic medicine for the Japanese population.

  50. Identification of the functional variant driving ORMDL3 and GSDMB expression in human chromosome 17q12-21 in primary biliary cholangitis. [International journal]

    Yuki Hitomi, Kaname Kojima, Minae Kawashima, Yosuke Kawai, Nao Nishida, Yoshihiro Aiba, Michio Yasunami, Masao Nagasaki, Minoru Nakamura, Katsushi Tokunaga

    Scientific reports 7 (1) 2904-2904 June 6, 2017

    DOI: 10.1038/s41598-017-03067-3  

    Numerous genome-wide association studies (GWAS) have been performed to identify susceptibility genes to various human complex diseases. However, in many cases, neither a functional variant nor a disease susceptibility gene has been clarified. Here, we show an efficient approach for identification of a functional variant in a primary biliary cholangitis (PBC)-susceptible region, chromosome 17q12-21 (ORMDL3-GSDMB-ZPBP2-IKZF3). High-density association mapping was carried out based on SNP imputation analysis by using the whole-genome sequence data from a reference panel of 1,070 Japanese individuals (1KJPN), together with genotype data from our previous GWAS (PBC patients: n = 1,389; healthy controls: n = 1,508). Among 23 single nucleotide polymorphisms (SNPs) with P < 1.0 × 10⁻⁸, rs12946510 was identified as the functional variant that influences gene expression via alteration of Forkhead box protein O1 (FOXO1) binding affinity in vitro. Moreover, expression-quantitative trait locus (e-QTL) analyses showed that the PBC susceptibility allele of rs12946510 was significantly associated with lower endogenous expression of ORMDL3 and GSDMB in whole blood and spleen. This study not only identified the functional variant in chr.17q12-21 and its molecular mechanism through which it conferred susceptibility to PBC, but it also illustrated an efficient systematic approach for post-GWAS analysis that is applicable to other complex diseases.

  51. Genome-Wide Association Study Identifies TLL1 Variant Associated With Development of Hepatocellular Carcinoma After Eradication of Hepatitis C Virus Infection. [International journal]

    Kentaro Matsuura, Hiromi Sawai, Kazuho Ikeo, Shintaro Ogawa, Etsuko Iio, Masanori Isogawa, Noritomo Shimada, Atsumasa Komori, Hidenori Toyoda, Takashi Kumada, Tadashi Namisaki, Hitoshi Yoshiji, Naoya Sakamoto, Mina Nakagawa, Yasuhiro Asahina, Masayuki Kurosaki, Namiki Izumi, Nobuyuki Enomoto, Atsunori Kusakabe, Eiji Kajiwara, Yoshito Itoh, Tatsuya Ide, Akihiro Tamori, Misako Matsubara, Norifumi Kawada, Ken Shirabe, Eiichi Tomita, Masao Honda, Shuichi Kaneko, Sohji Nishina, Atsushi Suetsugu, Yoichi Hiasa, Hisayoshi Watanabe, Takuya Genda, Isao Sakaida, Shuhei Nishiguchi, Koichi Takaguchi, Eiji Tanaka, Junichi Sugihara, Mitsuo Shimada, Yasuteru Kondo, Yosuke Kawai, Kaname Kojima, Masao Nagasaki, Katsushi Tokunaga, Yasuhito Tanaka

    Gastroenterology 152 (6) 1383-1394 May 2017

    DOI: 10.1053/j.gastro.2017.01.041  

    BACKGROUND & AIMS: There is still a risk for hepatocellular carcinoma (HCC) development after eradication of hepatitis C virus (HCV) infection with antiviral agents. We investigated genetic factors associated with the development of HCC in patients with a sustained virologic response (SVR) to treatment for chronic HCV infection. METHODS: We obtained genomic DNA from 457 patients in Japan with a SVR to interferon-based treatment for chronic HCV infection from 2007 through 2015. We conducted a genome-wide association study (GWAS), followed by a replication analysis of 79 candidate single nucleotide polymorphisms (SNPs) in an independent set of 486 patients in Japan. The study end point was HCC diagnosis or confirmation of lack of HCC (at follow-up examinations until December 2014 in the GWAS cohort, and until January 2016 in the replication cohort). We collected clinical and laboratory data from all patients. We analyzed expression levels of candidate gene variants in human hepatic stellate cells, rats with steatohepatitis caused by a choline-deficient L-amino acid-defined diet, and a mouse model of liver injury caused by administration of carbon tetrachloride. We also analyzed expression levels in liver tissues of patients with chronic HCV infection with different stages of fibrosis or tumors vs patients without HCV infection (controls). RESULTS: We found a strong association between the SNP rs17047200, located within the intron of the tolloid like 1 gene (TLL1) on chromosome 4, and development of HCC; there was a genome-wide level of significance when the results of the GWAS and replication study were combined (odds ratio, 2.37; P = 2.66 × 10⁻⁸). Multivariate analysis showed rs17047200 AT/TT to be an independent risk factor for HCC (hazard ratio, 1.78; P = .008), along with male sex, older age, lower level of albumin, advanced stage of hepatic fibrosis, presence of diabetes, and higher post-treatment level of α-fetoprotein. Combining the rs17047200 genotype with other factors, we developed prediction models for HCC development in patients with mild or advanced hepatic fibrosis. Levels of TLL1 messenger RNA (mRNA) in human hepatic stellate cells increased with activation. Levels of Tll1 mRNA increased in liver tissues of rodents with hepatic fibrogenesis compared with controls. Levels of TLL1 mRNA increased in liver tissues of patients with progression of fibrosis. Gene expression levels of TLL1 short variants, including isoform 2, were higher in patients with rs17047200 AT/TT. CONCLUSIONS: In a GWAS, we identified the association between the SNP rs17047200, within the intron of TLL1, and development of HCC in patients who achieved an SVR to treatment for chronic HCV infection. We found levels of Tll1/TLL1 mRNA to be increased in rodent models of liver injury and liver tissues of patients with fibrosis, compared with controls. We propose that this SNP might affect splicing of TLL1 mRNA, yielding short variants with high catalytic activity that accelerates hepatic fibrogenesis and carcinogenesis. Further studies are needed to determine how rs17047200 affects TLL1 mRNA levels, splicing, and translation, as well as the prevalence of this variant among other patients with HCC. Tests for the TLL1 SNP might be used to identify patients at risk for HCC after an SVR to treatment of HCV infection.

  52. Genome-wide association study using the ethnicity-specific Japonica array: identification of new susceptibility loci for cold medicine-related Stevens-Johnson syndrome with severe ocular complications. [International journal]

    Mayumi Ueta, Hiromi Sawai, Ryosei Shingaki, Yusuke Kawai, Chie Sotozono, Kaname Kojima, Kyung-Chul Yoon, Mee Kum Kim, Kyoung Yul Seo, Choun-Ki Joo, Masao Nagasaki, Shigeru Kinoshita, Katsushi Tokunaga

    Journal of human genetics 62 (4) 485-489 April 2017

    DOI: 10.1038/jhg.2016.160  

    A genome-wide association study (GWAS) for cold medicine-related Stevens-Johnson syndrome (CM-SJS) with severe ocular complications (SOC) was performed in a Japanese population. A recently developed ethnicity-specific array with genome-wide imputation that was based on the whole-genome sequences of 1070 unrelated Japanese individuals was used. Validation analysis with additional samples from Japanese individuals and replication analysis using samples from Korean individuals identified two new susceptibility loci on chromosomes 15 and 16. This study might suggest the usefulness of GWAS using the ethnicity-specific array and genome-wide imputation based on large-scale whole-genome sequences. Our findings contribute to the understanding of genetic predisposition to CM-SJS with SOC.

  53. Monitoring of minimal residual disease in early T-cell precursor acute lymphoblastic leukaemia by next-generation sequencing. [International journal]

    Xiaoqing Pan, Naoki Nariai, Noriko Fukuhara, Sakae Saito, Yukuto Sato, Fumiki Katsuoka, Kaname Kojima, Yoko Kuroki, Inaho Danjoh, Rumiko Saito, Shin Hasegawa, Yoko Okitsu, Aiko Kondo, Yasushi Onishi, Fuji Nagami, Hideyasu Kiyomoto, Atsushi Hozawa, Nobuo Fuse, Masao Nagasaki, Ritsuko Shimizu, Jun Yasuda, Hideo Harigae, Masayuki Yamamoto

    British journal of haematology 176 (2) 318-321 January 2017

    DOI: 10.1111/bjh.13948  

  54. Genetic analysis of Japanese primary open-angle glaucoma patients and clinical characterization of risk alleles near CDKN2B-AS1, SIX6 and GAS7. [International journal]

    Yukihiro Shiga, Koji M Nishiguchi, Yosuke Kawai, Kaname Kojima, Kota Sato, Kosuke Fujita, Mai Takahashi, Kazuko Omodaka, Makoto Araie, Kenji Kashiwagi, Makoto Aihara, Takeshi Iwata, Fumihiko Mabuchi, Mitsuko Takamoto, Mineo Ozaki, Kazuhide Kawase, Nobuo Fuse, Masayuki Yamamoto, Jun Yasuda, Masao Nagasaki, Toru Nakazawa

    PloS one 12 (12) e0186678 2017

    DOI: 10.1371/journal.pone.0186678  

    PURPOSE: To test the genetic association between Japanese patients with primary open-angle glaucoma (POAG) and the previously reported POAG susceptibility loci and to perform genotype-phenotype analysis. METHODS: Genetic associations for 27 SNPs from 16 loci previously linked to POAG were assessed using genome-wide SNP data of the primary cohort (565 Japanese POAG patients and 1,104 controls). Reproducibility of the assessment was tested in 607 POAG cases and 455 controls (second cohort) with a targeted genotyping approach. For POAG-associated variants, a genotype-phenotype correlation study (additive, dominant, recessive model) was performed using the objective clinical data derived from 598 eyes of 598 POAG patients. RESULTS: Among 27 SNPs from 16 loci previously linked to POAG, genotypes for a total of 20 SNPs in 13 loci were available for targeted association study. Among 8 SNPs in 3 loci that showed at least nominal association (P < 5.00E-02) in the primary cohort, a representative SNP for each locus (rs2157719 for CDKN2B-AS1, rs33912345 for SIX6, and rs9913911 for GAS7) was selected. For these SNPs the association was found significant in both the second cohort analysis and meta-analysis. The genotype-phenotype analysis revealed significant correlations between CDKN2B-AS1 (rs2157719) and decreased intraocular pressure (β = -6.89 mmHg, P = 1.70E-04; dominant model) after multiple corrections. In addition, nominal correlation was observed between CDKN2B-AS1 (rs2157719) and optic nerve head blood flow (β = -0.54 and -0.67 arbitrary units (AU), P = 2.00E-02 and 1.39E-02), between SIX6 (rs33912345) and decreased total peripapillary retinal nerve fiber layer thickness (β = -2.16 and -2.82 μm, P = 4.68E-02 and 2.40E-02, additive and recessive model, respectively) and increased optic nerve head blood flow (β = 0.44 AU, P = 2.20E-02; additive model) and between GAS7 (rs9913911) and increased cup volume (β = 0.03 mm³, P = 4.60E-02) and mean cup depth (β = 0.03 mm³, P = 4.11E-02; additive model) and decreased pattern standard deviation (β = -0.87 dB, P = 2.44E-02; dominant model). CONCLUSION: The association between SNPs near GAS7 and POAG was found in Japanese patients for the first time. Clinical characterization of the risk variants is an important step toward understanding the pathology of the disease and optimizing treatment of patients with POAG.

  55. AP-SKAT: highly-efficient genome-wide rare variant association test. [International journal] [Peer-reviewed]

    Takanori Hasegawa, Kaname Kojima, Yosuke Kawai, Kazuharu Misawa, Takahiro Mimori, Masao Nagasaki

    BMC genomics 17 (1) 745-745 September 21, 2016

    eISSN:1471-2164

    BACKGROUND: Genome-wide association studies have revealed associations between single-nucleotide polymorphisms (SNPs) and phenotypes such as disease symptoms and drug tolerance. To address the small sample size for rare variants, association studies tend to group gene or pathway level variants and evaluate the effect on the set of variants. One such strategy, the sequence kernel association test (SKAT), is a widely used collapsing method. However, the reported p-values from SKAT tend to be biased because the asymptotic property of the statistic is used to calculate the p-value. Although this bias can be corrected by applying permutation procedures for the test statistics, the computational cost of obtaining p-values with high resolution is prohibitive. RESULTS: To address this problem, we devise an adaptive SKAT procedure termed AP-SKAT that efficiently classifies significant SNP sets and ranks them according to the permuted p-values. Our procedure adaptively stops the permutation test when the significance level is outside some confidence interval of the estimated p-value for a binomial distribution. To evaluate the performance, we first compare the power and sample size calculations and the type I error rate estimates of SKAT, SKAT-O, and the proposed procedure using genotype data in the SKAT R package and from the 1000 Genomes Project. Through computational experiments using whole genome sequencing and SNP array data, we show that our proposed procedure is highly efficient and has comparable accuracy to the standard procedure. CONCLUSIONS: For several types of genetic data, the developed procedure could achieve competitive power and sample size under small and large sample size conditions while controlling type I error rates, and estimate p-values of significant SNP sets that are consistent with those estimated by the standard permutation test within a realistic time. This demonstrates that the procedure is sufficiently powerful for recent whole genome sequencing and SNP array data with increasing numbers of phenotypes. Additionally, this procedure can be used in other association tests by employing alternative methods to calculate the statistics.
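
The adaptive stopping rule described in the abstract above can be illustrated with a small sketch. This is not the AP-SKAT implementation: the function name, the `draw_null_stat` callable, the add-one p-value estimate, and the normal-approximation binomial interval are all assumptions made for illustration, since the abstract does not say which confidence interval is used.

```python
import math
import random


def adaptive_permutation_pvalue(observed, draw_null_stat, alpha=0.05, z=2.576,
                                min_perm=100, max_perm=100_000, seed=0):
    """Estimate a permutation p-value, stopping early once alpha lies
    outside a normal-approximation binomial confidence interval.

    draw_null_stat(rng) is a hypothetical callable that returns the test
    statistic for one random permutation of the phenotype labels.
    """
    rng = random.Random(seed)
    exceed = 0  # permutations whose statistic is at least the observed one
    for n in range(1, max_perm + 1):
        if draw_null_stat(rng) >= observed:
            exceed += 1
        if n < min_perm:
            continue
        p_hat = (exceed + 1) / (n + 1)  # add-one estimate, never exactly 0
        half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
        if alpha < p_hat - half_width or alpha > p_hat + half_width:
            return p_hat, n  # the decision about alpha is already clear
    return (exceed + 1) / (max_perm + 1), max_perm


# Toy null distribution: standard normal statistics (illustrative only).
null_stat = lambda rng: rng.gauss(0.0, 1.0)
p_null, n_null = adaptive_permutation_pvalue(0.0, null_stat)
p_sig, n_sig = adaptive_permutation_pvalue(10.0, null_stat)
```

In this toy setting both calls stop long before `max_perm`, because the estimated p-value is either clearly above or clearly below `alpha`. Only SNP sets whose p-value lies close to the threshold need many permutations.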

  56. The structural origin of metabolic quantitative diversity. (International journal)

    Seizo Koshiba, Ikuko Motoike, Kaname Kojima, Takanori Hasegawa, Matsuyuki Shirota, Tomo Saito, Daisuke Saigusa, Inaho Danjoh, Fumiki Katsuoka, Soichi Ogishima, Yosuke Kawai, Yumi Yamaguchi-Kabata, Miyuki Sakurai, Sachiko Hirano, Junichi Nakata, Hozumi Motohashi, Atsushi Hozawa, Shinichi Kuriyama, Naoko Minegishi, Masao Nagasaki, Takako Takai-Igarashi, Nobuo Fuse, Hideyasu Kiyomoto, Junichi Sugawara, Yoichi Suzuki, Shigeo Kure, Nobuo Yaegashi, Osamu Tanabe, Kengo Kinoshita, Jun Yasuda, Masayuki Yamamoto

    Scientific reports 6 31463-31463, 16 August 2016

    DOI: 10.1038/srep31463

    The relationship between structural variants of enzymes and metabolic phenotypes in the human population was investigated based on an association study of metabolite quantitative traits with whole genome sequence data for 512 individuals from a population cohort. We identified five significant associations between metabolites and non-synonymous variants. Four of these non-synonymous variants are located in enzymes involved in metabolic disorders, and structural analyses of these moderate non-synonymous variants demonstrate that they are located in peripheral regions of the catalytic sites or related regulatory domains. In contrast, two individuals with larger changes in metabolite levels were also identified, and these individuals carried rare variants that caused non-synonymous changes located near the catalytic site. These results are the first demonstration that variant frequency, structural location, and effect on phenotype correlate with each other in the human population, and imply that metabolic individuality and susceptibility to diseases may be elicited by moderate variants and by much more deleterious but rare variants.

  57. Fine-mapping analysis revealed complex pleiotropic effect and tissue-specific regulatory mechanism of TNFSF15 in primary biliary cholangitis, Crohn's disease and leprosy. (International journal)

    Yonghu Sun, Astrid Irwanto, Licht Toyo-Oka, Myunghee Hong, Hong Liu, Anand Kumar Andiappan, Hyunchul Choi, Yuki Hitomi, Gongqi Yu, Yongxiang Yu, Fangfang Bao, Chuan Wang, Xian Fu, Zhenhua Yue, Honglei Wang, Huimin Zhang, Minae Kawashima, Kaname Kojima, Masao Nagasaki, Minoru Nakamura, Suk-Kyun Yang, Byong Duk Ye, Yosua Denise, Olaf Rotzschke, Kyuyoung Song, Katsushi Tokunaga, Furen Zhang, Jianjun Liu

    Scientific reports 6 31429-31429, 10 August 2016

    DOI: 10.1038/srep31429

    Genetic polymorphism within the 9q32 locus is linked with increased risk of several diseases, including Crohn's disease (CD), primary biliary cholangitis (PBC) and leprosy. The most likely disease-causing gene within 9q32 is TNFSF15, which encodes the pro-inflammatory cytokine TNF super-family member 15, but it was unknown whether these disparate diseases were associated with the same genetic variance in 9q32, and how variance within this locus might contribute to pathology. Using genetic data from published studies on CD, PBC and leprosy we revealed that bearing a T allele at rs6478108/rs6478109 (r² = 1) or rs4979462 was significantly associated with increased risk of CD and decreased risk of leprosy, while the T allele at rs4979462 was associated with significantly increased risk of PBC. In vitro analyses showed that the rs6478109 genotype significantly affected TNFSF15 expression in cells from whole blood of controls, while functional annotation using publicly-available data revealed the broad cell type/tissue-specific regulatory potential of variance at rs6478109 or rs4979462. In summary, we provide evidence that variance within TNFSF15 has the potential to affect cytokine expression across a range of tissues and thereby contribute to protection from infectious diseases such as leprosy, while increasing the risk of immune-mediated diseases including CD and PBC.

  58. [Construction of 1070 Whole-genome Japanese Reference Panel and Bioinformatics].

    Masao Nagasaki, Yosuke Kawai, Kaname Kojima, Takahiro Mimori, Yumi Yamaguchi-Kabata

    Seikagaku. The Journal of Japanese Biochemical Society 88 (1) 15-24, February 2016

    ISSN:0037-1017

  59. A Bayesian approach for estimating allele-specific expression from RNA-Seq data with diploid genomes. (International journal)

    Naoki Nariai, Kaname Kojima, Takahiro Mimori, Yosuke Kawai, Masao Nagasaki

    BMC genomics 17 Suppl 1 2-2, 11 January 2016

    DOI: 10.1186/s12864-015-2295-5

    BACKGROUND: RNA-sequencing (RNA-Seq) has become a popular tool for transcriptome profiling in mammals. However, accurate estimation of allele-specific expression (ASE) based on alignments of reads to the reference genome is challenging, because it contains only one allele on a mosaic haploid genome. Even with the information of diploid genome sequences, precise alignment of reads to the correct allele is difficult because of the high-similarity between the corresponding allele sequences. RESULTS: We propose a Bayesian approach to estimate ASE from RNA-Seq data with diploid genome sequences. In the statistical framework, the haploid choice is modeled as a hidden variable and estimated simultaneously with isoform expression levels by variational Bayesian inference. Through the simulation data analysis, we demonstrate the effectiveness of the proposed approach in terms of identifying ASE compared to the existing approach. We also show that our approach enables better quantification of isoform expression levels compared to the existing methods, TIGAR2, RSEM and Cufflinks. In the real data analysis of the human reference lymphoblastoid cell line GM12878, some autosomal genes were identified as ASE genes, and skewed paternal X-chromosome inactivation in GM12878 was identified. CONCLUSIONS: The proposed method, called ASE-TIGAR, enables accurate estimation of gene expression from RNA-Seq data in an allele-specific manner. Our results show the effectiveness of utilizing personal genomic information for accurate estimation of ASE. An implementation of our method is available at http://nagasakilab.csml.org/ase-tigar .

  60. Whole-genome Japanese Reference Panel and future directions (Peer-reviewed)

    Masao Nagasaki, Jun Yasuda, Fumiki Katsuoka, Naoki Nariai, Kaname Kojima, Yosuke Kawai, Yumi Yamaguchi-Kabata, Junji Yokozawa, Inaho Danjoh, Sakae Saito, Yukuto Sato, Takahiro Mimori, Kaoru Tsuda, Rumiko Saito, Pan Xiaoqing, Satoshi Nishikawa, Shin Ito, Yoko Kuroki, Osamu Tanabe, Nobuo Fuse, Shinichi Kuriyama, Hideyasu Kiyomoto, Atsushi Hozawa, Naoko Minegishi, Kengo Kinoshita, Shigeo Kure, Nobuo Yaegashi, Masayuki Yamamoto

    GENES & GENETIC SYSTEMS 90 (6) 377-377, December 2015

    ISSN:1341-7568

    eISSN:1880-5779

  61. Japonica array: improved genotype imputation by designing a population-specific SNP array with 1070 Japanese individuals. (International journal)

    Yosuke Kawai, Takahiro Mimori, Kaname Kojima, Naoki Nariai, Inaho Danjoh, Rumiko Saito, Jun Yasuda, Masayuki Yamamoto, Masao Nagasaki

    Journal of human genetics 60 (10) 581-7, October 2015

    DOI: 10.1038/jhg.2015.68

    The Tohoku Medical Megabank Organization constructed the reference panel (referred to as the 1KJPN panel), which contains >20 million single nucleotide polymorphisms (SNPs), from whole-genome sequence data from 1070 Japanese individuals. The 1KJPN panel contains the largest number of haplotypes of Japanese ancestry to date. Here, from the 1KJPN panel, we designed a novel custom-made SNP array, named the Japonica array, which is suitable for whole-genome imputation of Japanese individuals. The array contains 659,253 SNPs, including tag SNPs for imputation, SNPs of Y chromosome and mitochondria, and SNPs related to previously reported genome-wide association studies and pharmacogenomics. The Japonica array provides better imputation performance for Japanese individuals than the existing commercially available SNP arrays with both the 1KJPN panel and the International 1000 genomes project panel. For common SNPs (minor allele frequency (MAF) > 5%), the genomic coverage of the Japonica array (r² > 0.8) was 96.9%, that is, almost all common SNPs were covered by this array. Nonetheless, the coverage of low-frequency SNPs (0.5% < MAF ≤ 5%) of the Japonica array reached 67.2%, which is higher than those of the existing arrays. In addition, we confirmed the high quality genotyping performance of the Japonica array using the 288 samples in 1KJPN; the average call rate was 99.7% and the average concordance rate with the genotypes obtained from a high-throughput sequencer was 99.7%. As demonstrated in this study, the creation of custom-made SNP arrays based on a population-specific reference panel is a practical way to facilitate further association studies through genome-wide genotype imputations.

  62. Short tandem repeat number estimation from paired-end sequence reads by considering unobserved genealogy of multiple individuals (Peer-reviewed)

    Kaname Kojima, Yosuke Kawai, Naoki Nariai, Takahiro Mimori, Takanori Hasegawa, Masao Nagasaki

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9096 422-423, 2015

    Publisher: Springer Verlag

    DOI: 10.1007/978-3-319-19048-8  

    ISSN:1611-3349 0302-9743

  63. iJGVD: an integrative Japanese genome variation database based on whole-genome sequencing. (International journal)

    Yumi Yamaguchi-Kabata, Naoki Nariai, Yosuke Kawai, Yukuto Sato, Kaname Kojima, Minoru Tateno, Fumiki Katsuoka, Jun Yasuda, Masayuki Yamamoto, Masao Nagasaki

    Human genome variation 2 15050-15050, 2015

    DOI: 10.1038/hgv.2015.50

    The integrative Japanese Genome Variation Database (iJGVD; http://ijgvd.megabank.tohoku.ac.jp/) provides genomic variation data detected by whole-genome sequencing (WGS) of Japanese individuals. Specifically, the database contains variants detected by WGS of 1,070 individuals who participated in a genome cohort study of the Tohoku Medical Megabank Project. In the first release, iJGVD includes >4,300,000 autosomal single nucleotide variants (SNVs) whose minor allele frequencies are >5.0%.

  64. Estimating copy numbers of alleles from population-scale high-throughput sequencing data. (International journal)

    Takahiro Mimori, Naoki Nariai, Kaname Kojima, Yukuto Sato, Yosuke Kawai, Yumi Yamaguchi-Kabata, Masao Nagasaki

    BMC bioinformatics 16 Suppl 1 S4, 2015

    DOI: 10.1186/1471-2105-16-S1-S4

    BACKGROUND: With the recent development of microarray and high-throughput sequencing (HTS) technologies, a number of studies have revealed catalogs of copy number variants (CNVs) and their association with phenotypes and complex traits. In parallel, a number of approaches to predict CNV regions and genotypes are proposed for both microarray and HTS data. However, only a few approaches focus on haplotyping of CNV loci. RESULTS: We propose a novel approach to infer copy unit alleles and their numbers in each sample simultaneously from population-scale HTS data by variational Bayesian inference on a generative probabilistic model inspired by latent Dirichlet allocation, which is a well studied model for document classification problems. In simulation studies, we evaluated concordance between inferred and true copy unit alleles for lower-, middle-, and higher-copy number dataset, in which precision and recall were ≥ 0.9 for data with mean coverage ≥ 10× per copy unit. We also applied the approach to HTS data of 1123 samples at highly variable salivary amylase gene locus and a pseudogene locus, and confirmed consistency of the estimated alleles within samples belonging to a trio of CEPH/Utah pedigree 1463 with 11 offspring. CONCLUSIONS: Our proposed approach enables detailed analysis of copy number variations, such as association study between copy unit alleles and phenotypes or biological features including human diseases.

  65. HLA-VBSeq: accurate HLA typing at full resolution from whole-genome sequencing data. (International journal)

    Naoki Nariai, Kaname Kojima, Sakae Saito, Takahiro Mimori, Yukuto Sato, Yosuke Kawai, Yumi Yamaguchi-Kabata, Jun Yasuda, Masao Nagasaki

    BMC genomics 16 Suppl 2 S7, 2015

    DOI: 10.1186/1471-2164-16-S2-S7

    BACKGROUND: Human leucocyte antigen (HLA) genes play an important role in determining the outcome of organ transplantation and are linked to many human diseases. Because of the diversity and polymorphisms of HLA loci, HLA typing at high resolution is challenging even with whole-genome sequencing data. RESULTS: We have developed a computational tool, HLA-VBSeq, to estimate the most probable HLA alleles at full (8-digit) resolution from whole-genome sequence data. HLA-VBSeq simultaneously optimizes read alignments to HLA allele sequences and abundance of reads on HLA alleles by variational Bayesian inference. We show the effectiveness of the proposed method over other methods through the analysis of predicting HLA types for HLA class I (HLA-A, -B and -C) and class II (HLA-DQA1,-DQB1 and -DRB1) loci from the simulation data of various depth of coverage, and real sequencing data of human trio samples. CONCLUSIONS: HLA-VBSeq is an efficient and accurate HLA typing method using high-throughput sequencing data without the need of primer design for HLA loci. Moreover, it does not assume any prior knowledge about HLA allele frequencies, and hence HLA-VBSeq is broadly applicable to human samples obtained from a genetically diverse population.

  66. Identification of the effects of prenatal stress on fetal epigenomic changes and psychiatric behavior (Peer-reviewed)

    兪 志前, 舟山 亮, 植野 和子, 成相 直樹, 小島 要, 小野 千晶, 笠原 好之, 長崎 正朗, 中山 啓子, 富田 博秋

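    Program and Abstracts of the Joint Annual Meeting of the Japanese Society of Clinical Neuropsychopharmacology and the Japanese Society of Neuropsychopharmacology, 24th/44th, 204-204, November 2014

    Publisher: Japanese Society of Clinical Neuropsychopharmacology / Japanese Society of Neuropsychopharmacology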

  67. Identification of acquired mutations by whole-genome sequencing in GATA-2 deficiency evolving into myelodysplasia and acute leukemia. (International journal)

    Tohru Fujiwara, Noriko Fukuhara, Ryo Funayama, Naoki Nariai, Mayumi Kamata, Takeshi Nagashima, Kaname Kojima, Yasushi Onishi, Yoji Sasahara, Kenichi Ishizawa, Masao Nagasaki, Keiko Nakayama, Hideo Harigae

    Annals of hematology 93 (9) 1515-22, September 2014

    DOI: 10.1007/s00277-014-2090-4

    Heterozygous GATA-2 germline mutations are associated with overlapping clinical manifestations termed GATA-2 deficiency, characterized by immunodeficiency and predisposition to myelodysplastic syndrome (MDS) and acute myeloid leukemia (AML). However, there is considerable clinical heterogeneity among patients, and the molecular basis for the evolution of immunodeficiency into MDS/AML remains unknown. Thus, we conducted whole-genome sequencing on a patient with a germline GATA-2 heterozygous mutation (c. 988 C > T; p. R330X), who had a history suggestive of immunodeficiency and evolved into MDS/AML. Analysis was conducted with DNA samples from leukocytes for immunodeficiency, bone marrow mononuclear cells for MDS and bone marrow-derived mesenchymal stem cells. Whereas we did not identify a candidate genomic deletion that may contribute to the evolution into MDS, a total of 280 MDS-specific nonsynonymous single nucleotide variants were identified. By narrowing down with the single nucleotide polymorphism database, the functional missense database, and NCBI information, we finally identified three candidate mutations for EZH2, HECW2 and GATA-1, which may contribute to the evolution of the disease.

  68. Validation of multiple single nucleotide variation calls by additional exome analysis with a semiconductor sequencer to supplement data of whole-genome sequencing of a human population. (International journal)

    Ikuko N Motoike, Mitsuyo Matsumoto, Inaho Danjoh, Fumiki Katsuoka, Kaname Kojima, Naoki Nariai, Yukuto Sato, Yumi Yamaguchi-Kabata, Shin Ito, Hisaaki Kudo, Ichiko Nishijima, Satoshi Nishikawa, Xiaoqing Pan, Rumiko Saito, Sakae Saito, Tomo Saito, Matsuyuki Shirota, Kaoru Tsuda, Junji Yokozawa, Kazuhiko Igarashi, Naoko Minegishi, Osamu Tanabe, Nobuo Fuse, Masao Nagasaki, Kengo Kinoshita, Jun Yasuda, Masayuki Yamamoto

    BMC genomics 15 673-673, 10 August 2014

    DOI: 10.1186/1471-2164-15-673

    BACKGROUND: Validation of single nucleotide variations in whole-genome sequencing is critical for studying disease-related variations in large populations. A combination of different types of next-generation sequencers for analyzing individual genomes may be an efficient means of validating multiple single nucleotide variation calls simultaneously. RESULTS: Here, we analyzed 12 independent Japanese genomes using two next-generation sequencing platforms: the Illumina HiSeq 2500 platform for whole-genome sequencing (average depth 32.4×), and the Ion Proton semiconductor sequencer for whole exome sequencing (average depth 109×). Single nucleotide polymorphism (SNP) calls based on the Illumina Human Omni 2.5-8 SNP chip data were used as the reference. We compared the variant calls for the 12 samples, and found that the concordance between the two next-generation sequencing platforms varied between 83% and 97%. CONCLUSIONS: Our results show the versatility and usefulness of the combination of exome sequencing with whole-genome sequencing in studies of human population genetics and demonstrate that combining data from multiple sequencing platforms is an efficient approach to validate and supplement SNP calls.

  69. SUGAR: graphical user interface-based data refiner for high-throughput DNA sequencing. (International journal)

    Yukuto Sato, Kaname Kojima, Naoki Nariai, Yumi Yamaguchi-Kabata, Yosuke Kawai, Mamoru Takahashi, Takahiro Mimori, Masao Nagasaki

    BMC genomics 15 664-664, 8 August 2014

    DOI: 10.1186/1471-2164-15-664

    BACKGROUND: Next-generation sequencers (NGSs) have become one of the main tools for current biology. To obtain useful insights from the NGS data, it is essential to control low-quality portions of the data affected by technical errors such as air bubbles in sequencing fluidics. RESULTS: We developed SUGAR (subtile-based GUI-assisted refiner), software that can handle ultra-high-throughput data with a user-friendly graphical user interface (GUI) and interactive analysis capability. SUGAR generates high-resolution quality heatmaps of the flowcell, enabling users to find possible signals of technical errors during the sequencing. The sequencing data generated from the error-affected regions of a flowcell can be selectively removed by automated analysis or GUI-assisted operations implemented in SUGAR. The automated data-cleaning function based on sequence read quality (Phred) scores was applied to public human whole-genome sequencing data, and we confirmed that the overall mapping quality was improved. CONCLUSION: The detailed data evaluation and cleaning enabled by SUGAR would reduce technical problems in sequence read mapping, improving subsequent variant analyses that require high-quality sequence data and mapping results. Therefore, the software will be especially useful to control the quality of variant calls for low-population cells, e.g., cancers, in a sample with technical errors of sequencing procedures.

  70. SVEM: A Structural Variant Estimation Method Using Multi-mapped Reads on Breakpoints

    Tomohiko Ohtsuki, Naoki Nariai, Kaname Kojima, Takahiro Mimori, Yukuto Sato, Yosuke Kawai, Yumi Yamaguchi-Kabata, Tetsuo Shibuya, Masao Nagasaki

    ALGORITHMS FOR COMPUTATIONAL BIOLOGY 8542 208-219, 2014

    ISSN:0302-9743

    eISSN:1611-3349

  71. HapMonster: A Statistically Unified Approach for Variant Calling and Haplotyping Based on Phase-Informative Reads (Peer-reviewed)

    Kaname Kojima, Naoki Nariai, Takahiro Mimori, Yumi Yamaguchi-Kabata, Yukuto Sato, Yosuke Kawai, Masao Nagasaki

    ALGORITHMS FOR COMPUTATIONAL BIOLOGY 8542 107-118, 2014

    ISSN:0302-9743

  72. TIGAR2: sensitive and accurate estimation of transcript isoform expression with longer RNA-Seq reads. (International journal)

    Naoki Nariai, Kaname Kojima, Takahiro Mimori, Yukuto Sato, Yosuke Kawai, Yumi Yamaguchi-Kabata, Masao Nagasaki

    BMC genomics 15 Suppl 10 S5, 2014

    DOI: 10.1186/1471-2164-15-S10-S5

    BACKGROUND: High-throughput RNA sequencing (RNA-Seq) enables quantification and identification of transcripts at single-base resolution. Recently, longer sequence reads have become available thanks to the development of new types of sequencing technologies as well as improvements in chemical reagents for the Next Generation Sequencers. Although several computational methods have been proposed for quantifying gene expression levels from RNA-Seq data, they are not sufficiently optimized for longer reads (e.g. >250 bp). RESULTS: We propose TIGAR2, a statistical method for quantifying transcript isoforms from fixed and variable length RNA-Seq data. Our method models substitution, deletion, and insertion errors of sequencers based on gapped-alignments of reads to the reference cDNA sequences so that sensitive read-aligners such as Bowtie2 and BWA-MEM are effectively incorporated in our pipeline. Also, a heuristic algorithm is implemented in variational Bayesian inference for faster computation. We apply TIGAR2 to both simulation data and real data of human samples and evaluate performance of transcript quantification with TIGAR2 in comparison to existing methods. CONCLUSIONS: TIGAR2 is a sensitive and accurate tool for quantifying transcript isoform abundances from RNA-Seq data. Our method performs better than existing methods for the fixed-length reads (100 bp, 250 bp, 500 bp, and 1000 bp of both single-end and paired-end) and variable-length reads, especially for reads longer than 250 bp.

  73. TIGAR: transcript isoform abundance estimation method with gapped alignment of RNA-Seq data by variational Bayesian inference. (International journal)

    Naoki Nariai, Osamu Hirose, Kaname Kojima, Masao Nagasaki

    Bioinformatics (Oxford, England) 29 (18) 2292-9, 15 September 2013

    DOI: 10.1093/bioinformatics/btt381

    MOTIVATION: Many human genes express multiple transcript isoforms through alternative splicing, which greatly increases diversity of protein function. Although RNA sequencing (RNA-Seq) technologies have been widely used in measuring amounts of transcribed mRNA, accurate estimation of transcript isoform abundances from RNA-Seq data is challenging because reads often map to more than one transcript isoform or to paralogs whose sequences are similar to each other. RESULTS: We propose a statistical method to estimate transcript isoform abundances from RNA-Seq data. Our method can handle gapped alignments of reads against reference sequences so that it allows insertion or deletion errors within reads. The proposed method optimizes the number of transcript isoforms by variational Bayesian inference through an iterative procedure, and its convergence is guaranteed under a stopping criterion. On simulated datasets, our method outperformed the comparable quantification methods in inferring transcript isoform abundances, and at the same time its rate of convergence was faster than that of the expectation maximization algorithm. We also applied our method to RNA-Seq data of human cell line samples, and showed that our prediction result was more consistent among technical replicates than those of other methods. AVAILABILITY: An implementation of our method is available at http://github.com/nariai/tigar CONTACT: nariai@megabank.tohoku.ac.jp SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

  74. iSVP: an integrated structural variant calling pipeline from high-throughput sequencing data. (International journal)

    Takahiro Mimori, Naoki Nariai, Kaname Kojima, Mamoru Takahashi, Akira Ono, Yukuto Sato, Yumi Yamaguchi-Kabata, Masao Nagasaki

    BMC systems biology 7 Suppl 6 S8, 2013

    DOI: 10.1186/1752-0509-7-S6-S8

    BACKGROUND: Structural variations (SVs), such as insertions, deletions, inversions, and duplications, are a common feature in human genomes, and a number of studies have reported that such SVs are associated with human diseases. Although the progress of next generation sequencing (NGS) technologies has led to the discovery of a large number of SVs, accurate and genome-wide detection of SVs remains challenging. Thus far, various calling algorithms based on NGS data have been proposed. However, their strategies are diverse and there is no tool able to detect a full range of SVs accurately. RESULTS: We focused on evaluating the performance of existing deletion calling algorithms for various spanning ranges from low- to high-coverage simulation data. The simulation data was generated from a whole genome sequence with artificial SVs constructed based on the distribution of variants obtained from the 1000 Genomes Project. From the simulation analysis, deletion calls of various deletion sizes were obtained with each caller, and it was found that the performance was quite different according to the type of algorithms and targeting deletion size. Based on these results, we propose an integrated structural variant calling pipeline (iSVP) that combines existing methods with a newly devised filtering and merging processes. It achieved highly accurate deletion calling with >90% precision and >90% recall on the 30× read data for a broad range of size. We applied iSVP to the whole-genome sequence data of a CEU HapMap sample, and detected a large number of deletions, including notable peaks around 300 bp and 6,000 bp, which corresponded to Alus and long interspersed nuclear elements, respectively. In addition, many of the predicted deletions were highly consistent with experimentally validated ones by other studies. CONCLUSIONS: We present iSVP, a new deletion calling pipeline to obtain a genome-wide landscape of deletions in a highly accurate manner. From simulation and real data analysis, we show that iSVP is broadly applicable to human whole-genome sequencing data, which will elucidate relationships between SVs across genomes and associated diseases or biological functions.

  75. Functional clustering of time series gene expression data by Granger causality. (International journal)

    André Fujita, Patricia Severino, Kaname Kojima, João Ricardo Sato, Alexandre Galvão Patriota, Satoru Miyano

    BMC systems biology 6 137-137, 30 October 2012

    DOI: 10.1186/1752-0509-6-137

    BACKGROUND: A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. RESULTS: In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. CONCLUSIONS: This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them.

  76. Searching optimal Bayesian network structure on constraint search space: Super-structure approach

    Seiya Imoto, Kaname Kojima, Eric Perrier, Yoshinori Tamada, Satoru Miyano

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6797 210-218, 2011

    DOI: 10.1007/978-3-642-25655-4_19  

    ISSN:0302-9743 1611-3349

  77. Cell illustrator 4.0: a computational platform for systems biology. (International journal)

    Masao Nagasaki, Ayumu Saito, Euna Jeong, Chen Li, Kaname Kojima, Emi Ikeda, Satoru Miyano

    Studies in health technology and informatics 162 160-81, 2011

    ISSN:0926-9630

    Cell Illustrator is a software platform for Systems Biology that uses the concept of Petri net for modeling and simulating biopathways. It is intended for biological scientists working at bench. The latest version of Cell Illustrator 4.0 uses Java Web Start technology and is enhanced with new capabilities, including: automatic graph grid layout algorithms using ontology information; tools using Cell System Markup Language (CSML) 3.0 and Cell System Ontology 3.0; parameter search module; high-performance simulation module; CSML database management system; conversion from CSML model to programming languages (FORTRAN, C, C++, Java, Python and Perl); import from SBML, CellML, and BioPAX; and, export to SVG and HTML. Cell Illustrator employs an extension of hybrid Petri net in an object-oriented style so that biopathway models can include objects such as DNA sequence, molecular density, 3D localization information, transcription with frame-shift, translation with codon table, as well as biochemical reactions.

  78. A fast and robust statistical test based on likelihood ratio with Bartlett correction to identify Granger causality between gene sets. (International journal, Peer-reviewed)

    André Fujita, Kaname Kojima, Alexandre G Patriota, João R Sato, Patricia Severino, Satoru Miyano

    Bioinformatics 26 (18) 2349-51, 15 September 2010

    DOI: 10.1093/bioinformatics/btq427

    UNLABELLED: We propose a likelihood ratio test (LRT) with Bartlett correction in order to identify Granger causality between sets of time series gene expression data. The performance of the proposed test is compared to a previously published bootstrap-based approach. LRT is shown to be significantly faster and statistically powerful even within non-Normal distributions. An R package named gGranger containing an implementation for both Granger causality identification tests is also provided. AVAILABILITY: http://dnagarden.ims.u-tokyo.ac.jp/afujita/en/doku.php?id=ggranger.

  79. Identification of granger causality between gene sets (Peer-reviewed)

    André Fujita, João Ricardo Sato, Kaname Kojima, Luciana Rodrigues Gomes, Masao Nagasaki, Mari Cleide Sogayar, Satoru Miyano

    Journal of Bioinformatics and Computational Biology 8 (4) 679-701, August 2010

    DOI: 10.1142/S0219720010004860  

    ISSN:0219-7200

  80. Identifying hidden confounders in gene networks by Bayesian networks (Peer-reviewed)

    Tomoya Higashigaki, Kaname Kojima, Rui Yamaguchi, Masato Inoue, Seiya Imoto, Satoru Miyano

    10th IEEE International Conference on Bioinformatics and Bioengineering 2010, BIBE 2010 168-173, 2010

    DOI: 10.1109/BIBE.2010.35  

  81. Cell Illustrator 4.0: a computational platform for systems biology. (International journal)

    Masao Nagasaki, Ayumu Saito, Euna Jeong, Chen Li, Kaname Kojima, Emi Ikeda, Satoru Miyano

    In silico biology 10 (1) 5-26, 2010

    DOI: 10.3233/ISB-2010-0415

    Cell Illustrator is a software platform for Systems Biology that uses the concept of Petri net for modeling and simulating biopathways. It is intended for biological scientists working at bench. The latest version of Cell Illustrator 4.0 uses Java Web Start technology and is enhanced with new capabilities, including: automatic graph grid layout algorithms using ontology information; tools using Cell System Markup Language (CSML) 3.0 and Cell System Ontology 3.0; parameter search module; high-performance simulation module; CSML database management system; conversion from CSML model to programming languages (FORTRAN, C, C++, Java, Python and Perl); import from SBML, CellML, and BioPAX; and, export to SVG and HTML. Cell Illustrator employs an extension of hybrid Petri net in an object-oriented style so that biopathway models can include objects such as DNA sequence, molecular density, 3D localization information, transcription with frame-shift, translation with codon table, as well as biochemical reactions.

  82. BFL: a node and edge betweenness based fast layout algorithm for large scale networks. International journal

    Tatsunori B Hashimoto, Masao Nagasaki, Kaname Kojima, Satoru Miyano

    BMC Bioinformatics 10 19-19 January 15, 2009

    DOI: 10.1186/1471-2105-10-19  

    BACKGROUND: Network visualization would serve as a useful first step for analysis. However, current graph layout algorithms for biological pathways are insensitive to biologically important information, e.g. subcellular localization, biological node and graph attributes, and/or are not available for large-scale networks, e.g. more than 10000 elements. RESULTS: To overcome these problems, we propose the use of a biologically important graph metric, betweenness, a measure of network flow. This metric is highly correlated with many biological phenomena such as lethality and clusters. We devise a new fast parallel algorithm calculating betweenness to minimize the preprocessing cost. Using this metric, we also invent a node and edge betweenness based fast layout algorithm (BFL). BFL places the high-betweenness nodes at optimal positions and allows the low-betweenness nodes to reach suboptimal positions. Furthermore, BFL reduces the runtime by combining a sequential insertion algorithm with betweenness. For a graph with n nodes, this approach reduces the expected runtime of the algorithm to O(n^2) when considering edge crossings, and to O(n log n) when considering only density and edge lengths. CONCLUSION: Our BFL algorithm is compared against fast graph layout algorithms and approaches requiring intensive optimizations. For gene networks, we show that our algorithm is faster than all layout algorithms tested while providing readability on par with intensive optimization algorithms. We achieve a 1.4 second runtime for a graph with 4000 nodes and 12000 edges on a standard desktop computer.

  83. A weight estimation method using LDA for multi-band speech recognition Peer-reviewed

    Koji Iwano, Kaname Kojima, Sadaoki Furui

    INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP 5 2534-2537 2006

MISC 14

  1. 日本人約6万人の血清総IgE値のGWAS並びにMHC関連解析

    志藤光介, 小島要, 木下賢吾

    アレルギー 73 (6/7) 2024

    ISSN: 0021-4884

  2. 日本人成人GWASによる肺機能・FeNO関連遺伝子の同定

    山田 充啓, 元池 育子, 小島 要, 光根 歩, 大江 崇, 一ノ瀬 正和, 山本 雅之, 杉浦 久敏

    日本呼吸器学会誌 11 (Suppl.) 129-129 April 2022

    Publisher: (一社)日本呼吸器学会

    ISSN: 2186-5876

    eISSN: 2186-5884

  3. GWAS and RNA Expression Analysis for Gestational Hypertension using Time-Series Home Blood Pressure Data in Maternity Log Study.

    Yoshiki Tsunemoto, Takafumi Yamauchi, Daisuke Ochi, Satsuki Kumatani, Takahiro Mimori, Kaname Kojima, Riu Yamashita, Maiko Wagata, Fumiki Katsuoka, Osamu Tanabe, Naoko Minegishi, Satoshi Hiyama, Masao Nagasaki, Junichi Sugawara

    REPRODUCTIVE SCIENCES 26 184A-184A March 2019

    ISSN: 1933-7191

    eISSN: 1933-7205

  4. Estimating frequency of pathogenic variants in a Japanese population by using the whole-genome reference panel of ToMMo

    Yumi Yamaguchi-Kabata, Jun Yasuda, Osamu Tanabe, Yoichi Suzuki, Hiroshi Kawame, Nobuo Fuse, Masao Nagasaki, Yosuke Kawai, Kaname Kojima, Fumiki Katsuoka, Sakae Saito, Inaho Danjoh, Ikuko N. Motoike, Riu Yamashita, Seizo Koshiba, Daisuke Saigusa, Gen Tamiya, Shigeo Kure, Nobuo Yaegashi, Yoshio Kawaguchi, Fuji Nagami, Shinichi Kuriyama, Junichi Sugawara, Naoko Minegishi, Atsushi Hozawa, Soichi Ogishima, Hideyasu Kiyomoto, Takako Takai-Igarashi, Kengo Kinoshita, Masayuki Yamamoto

    HUMAN GENOMICS 12 March 2018

    ISSN: 1473-9542

    eISSN: 1479-7364

  5. 卵巣明細胞がんを引き起こす遺伝子変異の発見(Discovery of gene alterations causing ovarian clear cell carcinoma)

    Shibuya Yusuke, Tokunaga Hideki, Saito Sakae, Kojima Kaname, Li Bin, Nagasaki Masao, Yasuda Jun, Yaegashi Nobuo

    日本婦人科腫瘍学会雑誌 35 (3) 566-566 June 2017

    Publisher: (公社)日本婦人科腫瘍学会

    ISSN: 1347-8559

  6. 早発型発達緑内障における原因遺伝子の探索

    布施 昇男, 木村 雅恵, 清水 愛, 河合 洋介, 小島 要, 長崎 正朗, 濱中 輝彦, 石田 誠夫, 中村 誠, 酒井 寛, 池田 陽子, 森 和彦, 中澤 徹, 勝岡 史城, 安田 純, 山本 雅之

    日本眼科学会雑誌 121 (Suppl.) 227-227 March 2017

    Publisher: (公財)日本眼科学会

    ISSN: 0029-0203

  7. 次世代シークエンサーを用いた卵巣明細胞腺癌の遺伝子変異の探索

    渋谷 祐介, 齋藤 さかえ, 小島 要, 李 賓, 徳永 英樹, 長崎 正朗, 安田 純, 八重樫 伸生

    日本癌学会総会記事 75th P-1369 October 2016

    Publisher: 日本癌学会

    ISSN: 0546-0476

  8. マルチオミクスが解き明かす疾患生物学 日本人多層オミックス参照パネルの拡張

    小柴 生造, 三枝 大輔, 元池 育子, 小島 要, 城田 松之, 齋藤 智, 勝岡 史城, 河合 洋介, 山口 由美, 田邉 修, 長崎 正朗, 安田 純, 木下 賢吾, 山本 雅之

    日本生化学会大会プログラム・講演要旨集 89th [1S05-5] September 2016

    Publisher: (公社)日本生化学会

  9. 卵巣明細胞腺癌を引き起こす遺伝子変異の発見(Discovery of gene alterations causing ovarian clear cell carcinoma)

    渋谷 祐介, 徳永 英樹, 安田 純, 長崎 正朗, 斎藤 さかえ, 小島 要, 李 賓, 八重樫 伸生

    日本婦人科腫瘍学会雑誌 34 (3) 452-452 June 2016

    Publisher: (公社)日本婦人科腫瘍学会

    ISSN: 1347-8559

  10. Inference of negative selection on human genome from whole genome sequences of 1070 individuals

    Yosuke Kawai, Naoki Nariai, Kaname Kojima, Yumi Yamaguchi-Kabata, Yukuto Sato, Takahiro Mimori, Masao Nagasaki

    GENES & GENETIC SYSTEMS 90 (6) 379-379 December 2015

    ISSN: 1341-7568

    eISSN: 1880-5779

  11. Estimation of allele frequency of pathological variants based on whole-genome sequencing of 1070 Japanese individuals

    Yumi Yamaguchi-Kabata, Yosuke Kawai, Kaname Kojima, Naoki Nariai, Takahiro Mimori, Yukuto Sato, Fumiki Katsuoka, Jun Yasuda, Masayuki Yamamoto, Masao Nagasaki

    GENES & GENETIC SYSTEMS 90 (6) 379-379 December 2015

    ISSN: 1341-7568

    eISSN: 1880-5779

  12. SUGAR: graphical user interface-based high-resolution data cleaning tool for high-throughput sequencing data

    Yukuto Sato, Kaname Kojima, Naoki Nariai, Yumi Yamaguchi-Kabata, Yosuke Kawai, Masao Nagasaki

    GENES & GENETIC SYSTEMS 89 (6) 329-329 December 2014

    ISSN: 1341-7568

    eISSN: 1880-5779

  13. Nonparametric inference of population demography from SNP data

    Yosuke Kawai, Yukuto Sato, Yumi Yamaguchi, Naoki Nariai, Sachiyo Sugimoto, Takahiro Mimori, Kaname Kojima, Masao Nagasaki

    GENES & GENETIC SYSTEMS 89 (6) 314-314 December 2014

    ISSN: 1341-7568

    eISSN: 1880-5779

  14. 眼疾患と遺伝子 緑内障のゲノム解析 次世代医療・個別化医療に向けて

    布施 昇男, 清水 愛, 木村 雅恵, 高野 良真, 石 棟, 宮澤 晃子, 国松 志保, 劉 孟林, 渡邉 亮, 安田 正幸, 横山 悠, 檜森 紀子, 津田 聡, 山本 耕太郎, 中澤 徹, 安田 純, 勝岡 史城, 小島 要, 成相 直樹, 松本 光代, 元池 育子, 長崎 正朗, 木下 賢吾, 五十嵐 和彦, 山本 雅之, 新堀 哲也, 青木 洋子, 松原 洋一, 舟山 亮, 長嶋 剛史, 中山 啓子, 眞島 行彦, 船山 智代, 田中 光一, 原田 高幸, 阿部 春樹, 福地 健郎, 安田 典子, 出田 秀尚, 鄭 暁東, 白石 敦, 大橋 裕一, 石田 誠夫, 原 岳, 金森 章泰, 山田 裕子, 中村 誠, 酒井 寛, Richards Julia E.

    日本眼科学会雑誌 118 (3) 216-240 March 2014

    Publisher: (公財)日本眼科学会

    ISSN: 0029-0203

Research Projects (Joint Research, Competitive Funding, etc.) 8

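  1. Development of an interpretable AI genome analysis platform that accounts for HLA polymorphisms to elucidate the mechanisms of allergic diseases

    小島 要, 山崎 研志

    April 2023 ~ March 2026

    IgE antibodies are in vivo molecules that bind to allergens and induce the release of histamine and other mediators, triggering inflammatory responses, and serum total IgE levels are known to be associated with allergic predisposition. Allergic diseases are important because they affect quality of daily life, yet much remains unknown about their mechanisms of action and genetic background. To elucidate the genetic background of allergic predisposition, we previously performed an association analysis of serum total IgE levels using SNP-array genotype data from about 10,000 samples in the genome cohort data of the Tohoku Medical Megabank Project, and identified the major histocompatibility complex (MHC) as an associated locus and a missense variant in the IL4R gene as an associated variant. IL-4 receptor α, encoded by the IL4R gene, is involved in IgE antibody production, and IL-4 receptor α inhibitors are known to be highly effective against atopic dermatitis and asthma. The MHC, on the other hand, shows a very strong signal as an associated locus, but because it is a difficult region to read, we had gone no further than reporting it as an associated locus. We are currently conducting an association analysis with serum total IgE levels using SNP-array genotype data expanded to tens of thousands of samples, with the aim of obtaining further associated variants. In addition, to resolve the association between the MHC and serum total IgE levels at higher resolution, we are inferring the alleles of HLA genes located in the MHC from SNP-array genotype data and conducting association analyses with serum IgE levels. HLA genes encode molecules that present antigens in the immune response; the HLA genes analyzed are HLA-A, -B, -C, -E, -F, -G, -DRB1, -DQA1, -DQB1, -DPA1, and -DPB1. In the association analysis with HLA genes, we perform not only allele-level association analysis but also association analysis at the level of amino acid substitutions, by converting alleles into information on amino acid substitutions at the antigen-presenting sites.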

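  2. Development of a program for predicting the onset of vitiligo vulgaris based on ultraviolet-sensitive nucleic acid polymorphisms

    山崎 研志, 小島 要, 元池 育子

    April 2022 ~ March 2025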

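  3. Establishment of a preclinical-stage diagnostic method for sporadic late-onset Alzheimer's disease using retinal layer thickness and genetic and environmental factors

    平良 摩紀子, 布施 昇男, 川崎 良, 三木 篤也, 小島 要, 田高 周

    April 2021 ~ March 2024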

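  4. Development of a platform for predicting temporal changes in ultraviolet-induced skin damage using deep learning and deep state-space models

    小島 要, 山崎 研志

    Funding agency: Japan Society for the Promotion of Science

    Program: Grants-in-Aid for Scientific Research

    Research category: Grant-in-Aid for Scientific Research (C)

    Research institution: Tohoku University

    April 2020 ~ March 2023

    Until last year, we had been developing a deep learning model that generates ultraviolet (UV) photographs from photographs taken under visible light, with the aim of measuring skin damage by identifying pigmented spots in UV photographs. Because the method developed up to last year assumed photographs acquired with dedicated imaging equipment installed at medical institutions and similar facilities, the accuracy of the generated UV photographs dropped for photographs taken with ordinary digital cameras such as smartphones. This year, we are therefore developing a method that can also generate UV photographs with high accuracy from photographs taken with smartphones and similar devices. Because improving accuracy requires image data taken with smartphones and similar devices under a variety of conditions, we are collecting image data jointly with board-certified dermatologists in the Department of Dermatology, Tohoku University. In addition, the method developed up to last year requires training data consisting of pairs of a visible-light photograph and a UV photograph, but collecting image data in that form is difficult for images taken with smartphones. We are therefore extending the method so that it is trained within the framework of CycleGAN, a generative adversarial network approach that can learn image translation even when the dataset does not consist of pairs of source and target images. For the deep learning model itself, we are also implementing, in addition to the U-net used so far, a newer model used in Self-attention GAN, and are comparing their accuracy.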

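  5. GWAS-based integrated analysis to elucidate the mechanisms of onset and progression to severe disease in primary biliary cholangitis

    中村 稔, 川嶋 実苗, 人見 祐基, 下田 慎治, 安波 道郎, 小島 要

    Funding agency: Japan Society for the Promotion of Science

    Program: Grants-in-Aid for Scientific Research

    Research category: Grant-in-Aid for Scientific Research (B)

    Research institution: Department of Clinical Research, National Hospital Organization Nagasaki Medical Center

    April 2017 ~ March 2020

    A nationwide collaborative genome-wide association study (GWAS) of Japanese patients with PBC was started in October 2010. By March 2020, GWAS analysis of 2,049 PBC cases and 1,985 controls had identified eight disease susceptibility loci that are significant in the Japanese population (P < 5.0 × 10^-8). We also established a method for identifying, among the multiple single-nucleotide polymorphisms (SNPs) within each susceptibility locus, the causal variant actually involved in PBC onset and the effector gene whose expression is altered by that causal variant, and clarified part of the molecular mechanism by which disease susceptibility genes lead to PBC onset.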

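  6. Comprehensive identification of disease-associated factors by building an integrated platform for microsatellite analysis from genomic information

    小島 要

    Funding agency: Japan Society for the Promotion of Science

    Program: Grants-in-Aid for Scientific Research

    Research category: Grant-in-Aid for Young Scientists (B)

    Research institution: Tohoku University

    April 2017 ~ March 2019

    Although many association studies have focused on single-nucleotide variants from SNP array data, studies of repeat-number polymorphisms in microsatellites remain limited, which is thought to be due to the accuracy of existing microsatellite analysis methods. In this study, we developed a method for estimating microsatellite repeat numbers from sequencing data and a repeat-number imputation method that takes gene genealogy information into account. Furthermore, with the aim of applying the developed methods to Japanese genome data, we compiled microsatellite polymorphism information from Japanese whole-genome sequencing data and built an analysis pipeline that incorporates findings important for association analysis, such as population structure analysis.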

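  7. Modeling of intracellular regulatory networks involving the epigenome and genes, and analysis of cell differentiation mechanisms

    小島 要

    Funding agency: Japan Society for the Promotion of Science

    Program: Grants-in-Aid for Scientific Research

    Research category: Grant-in-Aid for Young Scientists (B)

    Research institution: Tohoku University

    April 2013 ~ March 2015

    In the analysis of intracellular systems, regulatory networks have so far been considered that take only gene expression levels into account. In the course of cell differentiation, however, regulatory relationships are thought to change through changes not only in genes but also in epigenomic states such as DNA methylation and histone modifications, leading to differentiation into cells with different functions. In this study, we develop a highly accurate method for estimating gene expression levels, for use in analyzing regulatory networks of genes and the epigenome, as well as foundational methods for improving the accuracy of epigenomic state estimation. Furthermore, for the analysis of intracellular regulatory networks, we develop a method for predicting transcription factor binding from gene expression data and histone modification data obtained during mouse cell differentiation.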

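  8. Research on hierarchical grid layout algorithms suited to large-scale gene networks

    小島 要

    2008 ~ 2010

    This fiscal year we obtained the following four results. (1) In visualizing biological networks, the goal is biological understanding rather than visual appeal alone, so it is important to take into account the subcellular localization of each biomolecule; however, force-directed spring models have difficulty handling torus-shaped subcellular localization information. We therefore developed a layout method that can easily handle complex subcellular localization information, such as torus structures, by introducing the loss function of the force-directed spring model into grid layout. (2) In gene network estimation with Bayesian networks, a subset of genes is selected and the network is estimated on that subset for computational efficiency. When the structure of a Bayesian network is estimated by conditional independence tests, regulation from genes missing from the selected subset can cause inconsistencies in the test results. We therefore proposed a method that identifies the presence of missing genes from the locations where the test results are inconsistent. (3) Genes encoding proteins that belong to the same pathway or function are expected to be densely connected by regulatory relationships, and sparsely connected otherwise. We therefore defined a group of genes densely connected by regulatory relationships as a module, and proposed a method that simultaneously estimates gene modules and a highly accurate gene network from time-series gene expression data by hierarchically modeling, within a variational Bayes framework, the module estimation method for graphs based on the statistical modeling of Hofman and Wiggins together with a vector autoregressive model. (4) When Granger causality between gene sets is identified from time-series gene expression data by canonical correlation analysis, the p-values for detecting Granger causality had been computed by the bootstrap method, which was computationally costly and made it difficult to obtain accurate p-values. We therefore proposed a method that computes accurate p-values quickly, even for small-sample data, by correcting the result of the likelihood ratio test with the Bartlett correction.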
