Dear All,
I want to calculate the statistical significance of finding a DNA motif in a promoter sequence of a specific length. For example:
The motif "ATCGAT" occurs 5 times in a 2000 bp promoter. What kind of statistical test can be used to assess the enrichment of this motif in the promoter sequence?

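A simple calculation: the number of possible DNA k-mers is 4 raised to the power of k (4^k). That means there are 256 possible tetramers, 1024 pentamers and 4096 hexamers. Assuming all four bases are equally likely and positions are independent, any given hexamer is expected to occur about once per 4096 positions, i.e. roughly 0.5 times in a 2000 bp sequence. Observing it 5 times is about ten times that expectation, which is statistically significant under this simple background model.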
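To put a number on that, here is a minimal sketch in Python. It is not the only valid test, just one way to do it: it assumes equal base frequencies, independent positions, a single strand, and a Poisson approximation for the count (which ignores the fact that a motif can overlap with itself). The function names are made up for illustration.

```python
import math

def expected_count(seq_len, k):
    """Expected occurrences of one specific k-mer in a random sequence.

    Assumption: each base has probability 1/4 and positions are independent.
    """
    positions = seq_len - k + 1          # number of possible start positions
    return positions / 4 ** k

def poisson_tail(observed, lam):
    """P(X >= observed) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam ** i / math.factorial(i)
                for i in range(observed))
    return 1.0 - below

lam = expected_count(2000, 6)            # hexamer in a 2000 bp promoter
p_value = poisson_tail(5, lam)           # chance of seeing 5 or more hits

print(f"expected count: {lam:.3f}")
print(f"P(count >= 5):  {p_value:.2e}")
```

A real promoter is not uniform in base composition, so in practice the expected count would come from a background model (for example, the base or dinucleotide frequencies of a set of control promoters), and the p-value would change accordingly.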

Whether it is biologically significant is a different question. Your motif is short, which is usually the case for eukaryotic TF binding sites. Yet your motif is a palindrome, which is usually not the case for eukaryotic TFs. All that, plus a neat 2000 bp promoter size, sounds like a made-up example rather than a real one, so I think this might be homework. I will let you figure out the rest.