1. Clean the Wikipedia and Twitter corpora.
run: python ./CleanAndRBO/clean_colecction/clean_tweet_text.py
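The cleaning script itself is not reproduced here; below is a minimal sketch of typical tweet-text normalization (stripping URLs, @mentions, and the `#` marker is an assumption about what clean_tweet_text.py does, and `clean_tweet` is a hypothetical name):

```python
import re

def clean_tweet(text):
    """Minimal tweet normalization: strip URLs, @mentions, '#' markers, and extra whitespace."""
    text = re.sub(r"https?://\S+", "", text)   # remove URLs
    text = re.sub(r"@\w+", "", text)           # remove @mentions
    text = re.sub(r"#", "", text)              # keep the hashtag word, drop the '#'
    text = re.sub(r"\s+", " ", text).strip()   # collapse whitespace
    return text.lower()

print(clean_tweet("Check this out https://t.co/xyz @user #NLP rocks"))
# -> check this out nlp rocks
```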

2. Use Word2Vec to generate wiki_vectors and tweet_vectors from the two corpora.
In the derived vector files, the first line is the word count (and vector dimension), and a </s> token row is sometimes included; delete both.
run: python ./Vector/Corpus/AfterDeleteFirstLine/deleteFirstLine.py
generate: ./Vector/Corpus/AfterDeleteFirstLine/tweet_text0_vectors_cleaned.txt
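The deletion step above can be sketched as follows (deleteFirstLine.py is not shown; `strip_header_and_eos` is a hypothetical name, and it operates on a list of lines rather than a file for clarity):

```python
def strip_header_and_eos(lines):
    """Drop the word2vec header line ("<vocab_size> <dims>") and any </s> token row."""
    kept = []
    for i, line in enumerate(lines):
        if i == 0 and len(line.split()) == 2:
            continue                           # header line: "<vocab_size> <dims>"
        if line.split(" ", 1)[0] == "</s>":
            continue                           # sentence-boundary token row
        kept.append(line)
    return kept

raw = ["3 2", "</s> 0.1 0.2", "cat 1.0 0.0", "dog 0.0 1.0"]
print(strip_header_and_eos(raw))
# -> ['cat 1.0 0.0', 'dog 0.0 1.0']
```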

3. buildTransformation.sh trains the transformation matrix that maps one vector space to the other, using the top 1000 words from both corpora. It then generates new vectors by multiplying by the transformation matrix.
input: word vectors from the source and target corpora, plus the top 1000 words
generate: Vector/TransformateMatrix/Twitter/tm_Twitter_t2w.txt


4. <Notice> Build the sample matrix from the common words: extract the sub-matrix from the raw word2vec vectors.
run: python Vector/TransmatCorpus/Vector/Vectors/ExtractTweetVectors.py
generate: ./Vector/AfterTM/sorted_afterTranformationFrom0Base0Matirx.vecs.txt
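Extracting aligned sub-matrices over the shared vocabulary can be sketched as below (ExtractTweetVectors.py is not shown; `common_submatrices` is a hypothetical name, using word-to-vector dicts for brevity):

```python
def common_submatrices(vecs_a, vecs_b):
    """Align two word->vector dicts on their shared vocabulary, sorted by word.

    Returns the sorted common words and the two row-aligned matrices,
    so row i of each matrix is the same word in both spaces.
    """
    common = sorted(set(vecs_a) & set(vecs_b))
    A = [vecs_a[w] for w in common]
    B = [vecs_b[w] for w in common]
    return common, A, B

words, A, B = common_submatrices(
    {"cat": [1, 0], "dog": [0, 1], "xyz": [1, 1]},   # e.g. tweet vectors
    {"dog": [0.9, 0.1], "cat": [0.8, 0.2]},          # e.g. wiki vectors
)
print(words)
# -> ['cat', 'dog']
```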


5. Use the transformation matrix from step 3 to map the old vectors into the new space:

	./Vector/AfterTM/sortTMMatrixAndCaculateSimilarity_t2w.sh
	a. Sort the transformed matrix according to the common words.
	generate: ./Vector/AfterTM/afterTranformationFrom1Base0Matirx.vecs.txt
	./Vector/AfterTM/afterTranformationFrom1Base0Matirx.wds.txt
	b. Calculate the similarity between the transformed matrix and the sample matrix.
	c. Plot the similarity.
	d. Use PCA to reduce the original and transformed vectors to 2-D; the plot shows whether the methods above work.
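Steps (b) and (d) can be sketched as follows (the shell script's internals are not shown; cosine similarity and an SVD-based PCA are assumptions about what it computes, and `cosine_rows` / `pca_2d` are hypothetical names):

```python
import numpy as np

def cosine_rows(A, B):
    """Row-wise cosine similarity between two row-aligned matrices."""
    A, B = np.asarray(A, float), np.asarray(B, float)
    num = (A * B).sum(axis=1)
    den = np.linalg.norm(A, axis=1) * np.linalg.norm(B, axis=1)
    return num / den

def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    X = np.asarray(X, float)
    Xc = X - X.mean(axis=0)                       # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T                          # 2-D coordinates per row

# Row-wise similarity between a transformed matrix and the sample matrix.
sims = cosine_rows([[1, 0], [0, 2]], [[2, 0], [0, 1]])
print(sims)
# -> [1. 1.]
```

The 2-D points from `pca_2d` can then be scattered (e.g. with matplotlib) to see whether transformed vectors land near their sample-matrix counterparts.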