Python commands

String replacement

print(ss.replace('\n', ''))  # remove every newline from ss

Running shell commands from Python

import subprocess
subprocess.call(command, shell=True)
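A minimal sketch of the same task with `subprocess.run`, which can also capture what the command prints. The command used here (the current Python interpreter printing `hello`) is only an illustrative assumption; passing the command as a list avoids the need for `shell=True`.

```python
import subprocess
import sys

# Hypothetical command for illustration: call the current Python interpreter
# so the example does not depend on a particular shell.
result = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    capture_output=True,  # collect stdout and stderr
    text=True,            # decode bytes to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)
output = result.stdout.strip()
print(output)
```

`subprocess.call` only returns the exit code; `subprocess.run` returns an object that carries the exit code together with the captured output.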

Checking whether a string contains a substring

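sub in s                 # str: True if sub is a substring of s
item in lst              # list: True if item equals a whole element (not a substring test)
ser.isin(values)         # pandas Series: element-wise membership in values
ser.str.contains(sub)    # pandas Series: element-wise substring test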
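The difference between a substring test and a membership test is easy to mix up, so here is a small sketch. The barcode-like strings are made-up sample data.

```python
import pandas as pd

s = "barcode_ATCG_001"  # made-up sample string

# Plain str: `in` tests for a substring.
has_sub = "ATCG" in s

# list: `in` tests whether an element equals the whole item.
items = ["barcode_ATCG_001", "barcode_GGCC_002"]
in_list = "ATCG" in items

# Substring search across a list needs an explicit loop.
matches = [item for item in items if "ATCG" in item]

# pandas: isin() matches whole values, str.contains() matches substrings.
ser = pd.Series(items)
whole = ser.isin(["ATCG"]).tolist()
partial = ser.str.contains("ATCG", regex=False).tolist()
```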

Slicing

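data[[x,xx,x]]        # select several columns by label (on a DataFrame, a list inside [] picks columns, not rows)
data.loc[rows, cols]  # select by row and column labels
data.iloc[:,1:2]      # select by position: all rows, column 1 only (the slice end is excluded)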
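A small sketch of the three selection styles on a toy DataFrame. The frame, its labels, and the variable names are assumptions made up for illustration.

```python
import pandas as pd

# Toy data; row and column labels are hypothetical.
df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]},
    index=["r1", "r2", "r3"],
)

cols = df[["a", "c"]]         # columns selected by label
rows = df.loc[["r1", "r3"]]   # rows selected by label
cell = df.loc["r2", "b"]      # single value by row and column label
second_col = df.iloc[:, 1:2]  # all rows, column at position 1, kept as a DataFrame
```

Slicing with `1:2` keeps the result two-dimensional; `df.iloc[:, 1]` would return a Series instead.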

Python equivalents of the paste command

paste strings

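datause.columns=["COL01." + str(i) for i in datause.columns]  # prefix every column name

["s" + str(i) for i in range(1,11)]  # Python 3: range replaces xrange

list(map('s{}'.format, range(1, 11)))

list(map(lambda x:"s"+str(x), range(1,11)))  # in Python 3 map is lazy, so wrap it in list()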

paste two columns

anno['day']=anno['day'].astype("str")
anno['louvain13']=anno['louvain13'].astype("str")
anno['day_cluster']=anno['day'].astype(str)+"_"
anno['day_cluster']=anno['day_cluster'].astype(str)+anno['louvain13']
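The same join can be written as a single expression. This is a sketch on made-up values; only the column names `day`, `louvain13`, and `day_cluster` come from the snippet above.

```python
import pandas as pd

# Made-up sample values for illustration.
anno = pd.DataFrame({"day": [1, 1, 2], "louvain13": [0, 3, 3]})

# Convert both columns to str and join them with an underscore in one step.
anno["day_cluster"] = anno["day"].astype(str) + "_" + anno["louvain13"].astype(str)
```

Unlike the step-by-step version, this leaves the original `day` and `louvain13` columns with their numeric dtype.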

Computing correlations

Pearson correlation computed directly from a DataFrame

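import numpy as np
import pandas as pd

# np.corrcoef treats each row as a variable, so the result is a row-by-row
# correlation matrix with one row and one column per entry of datause_gene.index.
datause1=datause_gene.to_numpy()
pcc_gene=np.corrcoef(datause1)
pcc_gene=pd.DataFrame(pcc_gene)
pcc_gene.index=datause_gene.index.values
pcc_gene.columns =datause_gene.index.values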
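As a sanity check, the row-wise `np.corrcoef` result should agree with pandas' own `DataFrame.corr` applied to the transposed frame, since `corr` works column by column. The sketch below uses a made-up 3 x 4 matrix; the gene and cell labels are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up data: rows are variables (e.g. genes), columns are observations.
datause_gene = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0],
     [2.0, 4.0, 6.0, 8.0],
     [4.0, 3.0, 2.0, 1.0]],
    index=["geneA", "geneB", "geneC"],
    columns=["cell1", "cell2", "cell3", "cell4"],
)

# Row-by-row Pearson correlation via NumPy, relabelled with the row index.
pcc_np = pd.DataFrame(
    np.corrcoef(datause_gene.to_numpy()),
    index=datause_gene.index,
    columns=datause_gene.index,
)

# Same result via pandas: corr() correlates columns, so transpose first.
pcc_pd = datause_gene.T.corr()
```

In this toy data `geneB` is exactly twice `geneA` and `geneC` is its reverse, so the corresponding coefficients come out as 1 and -1 up to floating-point error.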

Group means

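df = pd.DataFrame([['a', 'man', 120, 90],
                   ['b', 'woman', 130, 100],
                   ['a', 'man', 110, 108],
                   ['a', 'woman', 120, 118]], columns=['level', 'gender', 'math','chinese'])
# numeric_only=True skips the non-numeric 'level' column, which pandas 2.x would otherwise reject
group = df.groupby('gender').mean(numeric_only=True)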

Column and row sums

colsum=mtx_tf.sum(axis=0)  # one sum per column
rowsum=mtx_tf.sum(axis=1)  # one sum per row

unique() is a method of pd.Series, so the values are converted to a pd.Series before calling it.

zz=list(barcode_list3.values())  # barcode_list3 is a dict; take its values
df = pd.Series(zz)
uu=df.unique()  # distinct values, in order of first appearance
len(uu)
pd.DataFrame(uu).to_csv("barcode3_96.csv")
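A short sketch of the de-duplication step on a made-up dictionary, with a plain-Python alternative for comparison. The keys and barcode strings below are hypothetical; only the name `barcode_list3` comes from the snippet above.

```python
import pandas as pd

# Hypothetical stand-in for the real barcode dictionary.
barcode_list3 = {"w1": "AAAC", "w2": "GGTT", "w3": "AAAC"}

# pandas route: wrap the values in a Series, then call unique().
values = pd.Series(list(barcode_list3.values()))
uu = values.unique()

# Plain-Python route: dict.fromkeys keeps the first occurrence of each value.
uu_plain = list(dict.fromkeys(barcode_list3.values()))
```

Both routes keep the values in order of first appearance, so they give the same sequence here.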

Reading a pickle file

import pickle

# The with block closes the file automatically, even if loading fails
with open("/media/ggj/home/ggj/Documents/data/fei/Rscripts/MW_script/barcode3_96_bc.pickle2","rb") as RT_bc3:
    barcode_list3 = pickle.load(RT_bc3)
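For completeness, a minimal round-trip sketch showing how such a file could be written and read back. The dictionary contents and the file name `barcodes.pickle` are assumptions for illustration; a temporary directory is used so nothing is left on disk.

```python
import os
import pickle
import tempfile

# Hypothetical data standing in for the real barcode dictionary.
barcodes = {"AAAC": 1, "GGTT": 2}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "barcodes.pickle")

    # Pickle files are binary, so open with "wb" to write ...
    with open(path, "wb") as fh:
        pickle.dump(barcodes, fh)

    # ... and with "rb" to read.
    with open(path, "rb") as fh:
        loaded = pickle.load(fh)
```

Only unpickle files from a trusted source, because loading a pickle can execute arbitrary code.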