In Silico Analysis of Regulatory Elements of the Vitamin D Receptor

Vitamin D receptor (VDR) is a nuclear transcription factor that controls gene expression. Its impaired expression has been linked to several diseases. VDR also regulates pathways including differentiation, inflammation, and calcium and phosphate absorption, but little is known about the regulation of the VDR gene itself. A better understanding of the genetic and epigenetic factors regulating VDR may therefore help improve strategies for the prevention and treatment of diseases associated with VDR dysregulation. In the present investigation, a set of databases and methods was used to identify putative functional elements in the VDR locus. Histone modifications, CpG islands, and other epigenetic marks at the VDR locus were identified. In addition, repeated sequences, enhancers, insulators, transcription factor binding sites, transcriptional targets of VDR, and protein-protein interactions were identified with bioinformatics tools. Some of these genetic elements overlapped with CpG islands. These results provide new insight into the molecular mechanisms of VDR gene regulation in human cells and tissues.

VDR silencing may result from methylation of a CpG island or from genetic variation in the VDR promoter (13). Polymorphisms in the VDR gene, such as the TaqI and BsmI restriction-site polymorphisms, have been associated with different inflammatory diseases and could affect individual responses to vitamin D treatment (14). Moreover, the VDR promoter has a GC-rich region from −790 bp to +380 bp (15). However, characterizing the genetic variation and the complexity of CpG island methylation in the VDR gene remains challenging, partly because the cis-regulatory elements that control its expression are incompletely understood. The present study investigated some properties of the VDR gene sequence and described some mechanisms that may control its expression, since a comprehensive study of the genetic and epigenetic features of the human VDR gene may uncover new aspects of its regulation. In addition, the simultaneous use of several databases and approaches should give a more comprehensive understanding of the VDR gene and may help improve clinical management, treatment, and responses to vitamin D.

Materials and Methods
In this investigation, the locus of interest was the Homo sapiens VDR gene (NM_000376), which is located on 12q13-14 and is composed of 11 exons (16). Several databases were used to identify regulatory elements in this gene. The databases and software used for analyzing the VDR locus are presented in Table 1.

Analysis of CpG Islands
Analysis with the UCSC browser indicated the presence of four CpG islands in VDR. CpG Islands 71 and 77 are longer than 300 bp, while CpG Islands 24 and 16 are shorter than 300 bp (Fig. 1). Further analysis with "bona fide" showed that two CpG islands in the VDR gene (1060 and 1062) overlapped with CpG Island 71 predicted by the UCSC browser (Fig. 1).
In addition, analysis of the VDR gene using "Weizmann Evolutionary CpG islands" identified five different CpG islands, all of which overlap with exons except CpG Island 0.9 (Fig. 1). Moreover, the "CpGProD" program detected six putative CpG islands (CpG1, CpG2, CpG3, CpG4, CpG5, and CpG6), all longer than 500 bp but with lower CpGo/e ratios than those predicted by the UCSC browser (Fig. 1).
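The length and CpGo/e values compared above can be illustrated with a minimal sketch. It assumes the widely used observed/expected definition, (number of CpG × sequence length) / (number of C × number of G), and illustrative thresholds (length of at least 200 bp, GC fraction of at least 0.5, CpGo/e above 0.6) that are not taken from this study. Each tool named above applies its own criteria, and the function names are hypothetical.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, GC fraction and CpG observed/expected ratio of a DNA sequence."""
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Observed/expected CpG; defined as 0.0 here when C or G is absent.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"length": n, "gc_fraction": gc_fraction, "cpg_obs_exp": obs_exp}


def looks_like_cpg_island(seq: str, min_length: int = 200,
                          min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply illustrative (assumed) thresholds to a candidate region."""
    stats = cpg_stats(seq)
    return (stats["length"] >= min_length
            and stats["gc_fraction"] >= min_gc
            and stats["cpg_obs_exp"] > min_obs_exp)


if __name__ == "__main__":
    candidate = "CG" * 150  # toy sequence, not a real VDR region
    print(cpg_stats(candidate), looks_like_cpg_island(candidate))
```

Raising `min_length` to 300 or 500 in this sketch mimics the stricter length cut-offs mentioned in the text, at the cost of missing shorter islands.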

Analysis of CpG Methylation
There is a remarkable correspondence between certain DNA properties of CpG islands and their epigenetic status, which was used to score all CpG islands according to their strength. High CpG island strength is indicated by the absence of DNA methylation, frequent promoter activity, and an open chromatin structure (16). On this basis, we found one bona fide CpG island with a combined epigenetic score > 0.67 (CpG Island 1062) and another with a score > 0.33 (CpG Island 1060) (Fig. 1). In addition, CCGC motif analysis showed that there are several CCGC motifs within CpG Island 1062 (http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg18&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr12%3A46521587-46585081&hgsid=603289241_vDNe68Zs8Ty78FrJURMU1aMldoYD).

Histone Modifications analysis
The presence of histone marks (H3K4me1, H3K4me3, and H3K27me3) within the VDR gene sequence was investigated to analyze histone methylation. The H3K4me1 histone mark is the mono-methylation of lysine 4 of the H3 histone protein and is associated with enhancers and with DNA regions downstream of transcription start sites (http://genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=603289241_vDNe68Zs8Ty78FrJURMU1aMldoYD&c=chr12&g=wgEncodeRegMarkEnhH3k4me1).
The H3K4me3 histone mark is the trimethylation of lysine 4 of the H3 histone protein and is associated with promoters that are active or poised to be activated. The results revealed regions carrying H3K4me1, H3K4me3, and H3K27me3 in different cell lines of the ENCODE group. These regions mostly overlapped with the promoter, CpG Island 1062, and the other CpG islands (Fig. 2). In addition, ENCODE ChIP-seq data on the acetylation of H3K27 and H3K9 indicated that H3K27ac coincided with the promoter and the CpG islands in VDR exon 1, while H3K9ac was found in exons 1, 3, and 9 and in introns 2 and 8 (Fig. 2).

Investigation of different classes of repeats
The RepeatMasker track of the UCSC browser was used to study repeated sequences in the VDR locus. The results showed different classes of repeats, including several long interspersed nuclear elements (LINEs) that overlap with CpG Island 1060. In addition, many short interspersed nuclear elements (SINEs), mostly of the Alu family, a few long terminal repeat (LTR) elements, which include retroposons, some DNA repeat elements (13), several simple repeats (microsatellites), and a few low-complexity repeats were found (Supplementary Fig. 1). None of these other repeat classes overlapped with any CpG island. Moreover, tandem repeats were analyzed with "Sequence-based Estimation of Repeat Variability", which indicated more than ten tandem repeats in the VDR locus (Supplementary Table 1).
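The overlap statements in this section (for example, repeats or histone marks against CpG islands) reduce to interval intersection on genomic coordinates. The following is a minimal sketch under that assumption. The `Feature` type, the function names, and all coordinates are hypothetical placeholders, not values from the browser tracks used in this study.

```python
from typing import List, NamedTuple, Tuple


class Feature(NamedTuple):
    name: str
    start: int  # 0-based, inclusive
    end: int    # exclusive (half-open interval, as in BED files)


def overlaps(a: Feature, b: Feature) -> bool:
    """True when two half-open intervals share at least one base."""
    return a.start < b.end and b.start < a.end


def find_overlaps(features: List[Feature],
                  islands: List[Feature]) -> List[Tuple[str, str]]:
    """List (feature, island) name pairs whose intervals intersect."""
    return [(f.name, i.name)
            for f in features for i in islands if overlaps(f, i)]


if __name__ == "__main__":
    # Placeholder coordinates for illustration only.
    islands = [Feature("CpG_A", 1000, 1800), Feature("CpG_B", 5000, 5400)]
    repeats = [Feature("LINE_1", 1700, 2500), Feature("Alu_1", 3000, 3300)]
    print(find_overlaps(repeats, islands))
```

For whole-genome track comparisons an interval tree or a sorted sweep would scale better than this pairwise loop, but for a single locus the quadratic check is adequate.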

In silico Transcription Factor Binding Sites analysis
PReMode and CisRed databases were used to analyze transcription factor binding sites. The PReMode database describes more than 100,000 computationally predicted cis-regulatory modules within the human genome (17). The PReMode results indicated four cis-regulatory modules (transcription factor binding sites) of different lengths in the VDR locus. These modules have binding sites for several transcription factors, and one module overlapped with the CpG islands in the VDR gene (Fig. 3a). One of these modules is shown as an example in Figure 3b. The CisRed database also identified two conserved DNA sequence motifs in the VDR gene (in the promoter and exon 1; Fig. 3c).

Transcription factors regulating VDR and transcriptional targets
VDR downstream transcriptional targets and VDR regulators were retrieved from TRRUST v2 (Table 2). These included activators and repressors, but the mode of regulation of some is unknown. TRRUST v2 (Transcriptional Regulatory Relationships Unraveled by Sentence-based Text mining) is a database of 8444 regulatory interactions for 800 TFs in humans (18). A functional network of VDR target genes and a protein-protein interaction network of VDR regulator genes are shown in Supplementary Figure 2 (http://www.grnpedia.org/trrust/). Furthermore, Struct2Net (structure-based computational prediction of protein-protein interactions) was used to find VDR protein-protein interactions. This algorithm uses logistic regression to compute a single score between 0 and 1. A score of 0 indicates minimal confidence in an interaction between the two proteins, while a score of 1 indicates maximum confidence (19). A total of 15 predicted protein-protein interactions that have not yet been experimentally observed were found (Supplementary Table 2).
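To make the 0-to-1 interaction score concrete, the sketch below shows how a generic logistic regression model maps a weighted sum of features to such a score. It is an illustration of the logistic function only. The features, weights, and bias are hypothetical and are not the ones used by Struct2Net.

```python
import math
from typing import Sequence


def logistic_score(features: Sequence[float],
                   weights: Sequence[float],
                   bias: float = 0.0) -> float:
    """Map a linear combination of features to a confidence score in [0, 1]."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = bias + sum(w * x for w, x in zip(weights, features))
    # Numerically stable logistic function: avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Hypothetical feature values and weights, for illustration only.
    print(logistic_score([0.8, 0.3], [2.0, -1.0], bias=-0.5))
```

A score near 0.5 corresponds to a linear predictor near zero, that is, to a prediction the model cannot separate in either direction.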

Identifying transcriptional enhancers
VDR transcriptional enhancers were investigated with EnhancerAtlas. The results showed several enhancers affecting the VDR gene. Some of these enhancers were located near transcription start sites (TSSs) of the VDR gene, while others were at considerable distances from the TSS. More than three enhancers were found in each of the cell types examined (Supplementary Fig. 3).

Searching CTCF sites in VDR locus
The UCSC Browser (ENCODE) and the "CTCFBSDB" database were used to find CCCTC-binding factor (CTCF) binding sites. ENCODE data indicated eleven CTCF binding sites in the VDR gene. Some of these binding sites were located in CpGI1, CpGI3, CpGI5, and CpGI6 identified by CpGProD, and most overlapped with CpG Island 1062 predicted by bona fide. In addition, some of the CTCF binding sites overlap with the VDR promoter (Fig. 1). However, only one CTCF binding site, located at nucleotide position 9491 of the VDR gene, was identified using the "CTCFBSDB" database. The sequence of this insulator was AGTACCAGCAGGTGGCACAC.
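As an illustration of how the reported 20-nt site could be located in a sequence, the sketch below scans both strands for an exact match. It is a plain string search, not the prediction method used by CTCFBSDB. The function names and the toy input sequence are hypothetical.

```python
from typing import List, Tuple

CTCF_SITE = "AGTACCAGCAGGTGGCACAC"  # sequence reported in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(sequence: str, motif: str) -> List[Tuple[int, str]]:
    """Return sorted (1-based start, strand) pairs for exact motif matches."""
    seq = sequence.upper()
    forward = motif.upper()
    patterns = [("+", forward)]
    reverse = reverse_complement(forward)
    if reverse != forward:  # avoid double-reporting palindromic motifs
        patterns.append(("-", reverse))
    hits = []
    for strand, pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start + 1, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence containing the site once on each strand.
    toy = "TTT" + CTCF_SITE + "AA" + reverse_complement(CTCF_SITE)
    print(find_motif(toy, CTCF_SITE))
```

Exact matching is deliberately strict. Real CTCF site prediction usually relies on position weight matrices, which tolerate mismatches at less conserved positions.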

Discussion
Recent studies have shown that the biological actions of vitamin D are mediated through VDR (20). However, the precise control and molecular mechanisms of this receptor's expression are still not fully understood. In this study, databases and in silico tools were used to identify putative regulatory elements and gene features that might mediate the regulation of VDR. The computational identification of these features and elements offers new insight into the control of human VDR expression and its association with health and disease. This bioinformatics analysis supports the idea that different genetic and epigenetic elements are involved in regulating the activity and expression of the VDR gene.
An in silico study by Halsall et al. (21) showed that exon 1a of VDR lies within a strong CpG island and that transcription from exon 1a may therefore be regulated by methylation. Marik et al. (22) found that some CpG islands lie in a region from −790 bp to +380 bp relative to the VDR transcription start site. They observed that these CpG islands were methylated in breast cancer. In addition, promoter methylation of the VDR gene was found in adrenocortical carcinoma (13) but not in parathyroid adenomatous tissues (23).
This investigation showed that there are additional CpG islands in the VDR gene. The association of these CpG islands with other features, including transcription factor binding sites, histone modifications, enhancers, and genomic insulators, was studied. Because regulatory elements overlap with CpG islands at the VDR locus, DNA methylation could modulate TF binding sites at the VDR gene, interact with histone modifications of the VDR promoter, and control enhancers and insulators (24). In addition, this study found that some repeated elements at the VDR locus overlap with a CpG island, and the number of these repeats varies among individuals (25). DNA methylation of the VDR promoter may therefore also vary among individuals, which could be exploited in personalized medicine (26).
The present results demonstrated that histone marks overlapped with CpG islands at the VDR locus. The number and distribution patterns of histone modifications regulate gene expression. These two overlapping epigenetic events at the VDR locus may reinforce, reduce, or counteract each other's effects, and further investigations are needed to study these interactions.
A comprehensive understanding of the actions of vitamin D and its receptor requires analysis of VDR binding sites. The VDR recognizes a specific DNA element composed of two hexameric nucleotide half-sites separated by three base pairs (27). By studying the common VDR binding site (VDRE) in the regulatory sequences of VDR targets, the results indicated 50 proteins that may be targets of VDR. According to these results and other studies (28,29), VDR activates some genes and represses others, but its effects on some genes are unknown. Functional studies are therefore needed to investigate the action of VDR more precisely. Furthermore, VDR was shown to be a target of many transcription factors, and in several of these cases the effects on VDR expression are also unknown. Thus, the roles of VDR in this regulatory network should be examined.
Regulatory elements such as enhancers and insulators at the VDR locus were also investigated. Several enhancers located at a distance from the VDR promoter region were found. Previous studies suggest that regulatory enhancers around promoters may be more common than once believed, which points to an important role for chromatin looping and chromatin rearrangement in transcriptional control (30). In addition, modification of some of these enhancer regions may contribute to subsequent changes in the level of VDR transcription (31). The precise function of some enhancer regions at the VDR locus has not yet been determined because their presence in several human cells and tissues is unknown.
The results also revealed a few insulators in the VDR locus that are located in VDR CpG islands and overlap with histone modifications. CTCF is a factor that plays important roles in genomic processes including imprinting, transcription, chromatin rearrangements, and chromatin interactions (32). CTCF can bind to a wide range of sequences and regulates gene expression by acting as a repressor or activator. In addition, its binding site has an insulator role (33). Wang et al. (34) reported the presence of several CpGs at the CTCF consensus motif in different cell types. They also showed that methylation is a global feature of the regulatory diversity of CTCF. Our results therefore suggest crosstalk among CTCF binding sites (insulators), CpG methylation, and histone modifications that may affect VDR gene regulation, and these effects could vary among cell types and tissues.
In conclusion, these findings from bioinformatics sequence analyses and databases suggest crosstalk among specific regulatory motifs, methylation of various sets of CpG islands, transcription factors, and histone modifications, which have combinatorial influences on VDR gene expression. These elements can provide targets for further, more precise functional analysis and may even point to promising therapeutic strategies for different diseases. However, these regulatory mechanisms should be examined experimentally before they can contribute to disease treatment and prevention. Finally, our data provide new insight into the action of VDR as a modular component of the regulatory pathways that control different cellular processes.