Hypertension Data Analysis
2018-08-24 11:11:58
8.24
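Basic parameter questions
1.1 The physical-exam form records blood pressure for both the left and right arms. The value used for analysis is derived from these: if both sides are recorded, take their mean; if only one side is recorded, use that reading.
1.2 Missing blood pressure values: what is the missing rate? Simply drop people with missing blood pressure from the sample.
1.3 Distribution of blood pressure values: SBP has no outliers; DBP has 89 records with the value 850, and those 89 records are removed.
1.4 Final sample size: 87121 original records - ? records with missing blood pressure - 89 records with outliers = ?
1.5 Definition of blood pressure control: systolic pressure below 140 and diastolic pressure below 90.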
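The control definition in 1.5 can be written as a small predicate. This is a minimal sketch; the function name `is_bp_controlled` is hypothetical and does not appear in the original analysis.

```python
def is_bp_controlled(sbp, dbp):
    """Return True when SBP < 140 and DBP < 90 (definition 1.5)."""
    # Both comparisons are strict, so a reading of exactly 140 or 90 counts as uncontrolled.
    return sbp < 140 and dbp < 90


print(is_bp_controlled(128, 82))  # both values below their thresholds
print(is_bp_controlled(140, 82))  # SBP exactly at the threshold
```

On a DataFrame the same rule could be applied column-wise as `(df['blood_pressure_sbp'] < 140) & (df['blood_pressure_dbp'] < 90)`.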
1.1
Code
sbp
def get_blood_pressure_sbp(row):
    # Mean of both arms; a missing side is skipped, so a single-arm record keeps its one reading.
    return row[['blood_pressure_right_sbp', 'blood_pressure_left_sbp']].astype(float).mean()
df_35p_1['blood_pressure_sbp'] = df_35p_1.apply(get_blood_pressure_sbp, axis=1)
dbp
def get_blood_pressure_dbp(row):
    # Mean of both arms; a missing side is skipped, so a single-arm record keeps its one reading.
    return row[['blood_pressure_right_dbp', 'blood_pressure_left_dbp']].astype(float).mean()
df_35p_1['blood_pressure_dbp'] = df_35p_1.apply(get_blood_pressure_dbp, axis=1)
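The rule from 1.1 (mean of both arms, or the single available reading) can also be computed without a row-wise `apply`. The sketch below uses made-up toy values and assumes missing readings are stored as NaN/None; only the column names come from the notes.

```python
import pandas as pd

# Toy data (made-up values): the second person has only a right-arm reading.
df = pd.DataFrame({
    'blood_pressure_right_sbp': [130.0, 150.0],
    'blood_pressure_left_sbp': [134.0, None],
})

# mean(axis=1) skips NaN by default, so a single-arm record keeps its one value.
df['blood_pressure_sbp'] = df[['blood_pressure_right_sbp', 'blood_pressure_left_sbp']].mean(axis=1)
print(df['blood_pressure_sbp'].tolist())
```

The column-wise form avoids calling a Python function once per row, which tends to matter on a sample of this size.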
1.2
85259 people have a non-null mean blood pressure
Code
df_35p_1 = df_35p[df_35p['blood_pressure_right_sbp'].isnull() == False]
df_35p_1.count()['id']
1.3
After filtering, the count is updated to 85170
Code
df_35p_2 = df_35p_1[df_35p_1['blood_pressure_dbp'] < 200]
df_35p_2.count()['id']
Final result
Total count
87121
Count after filtering
85170
Difference
1951
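The counts reported in 1.2 and 1.3 can be cross-checked against the total difference with simple arithmetic. The variable names below are illustrative; the numbers are the ones given in the notes.

```python
total = 87121         # original sample (1.4)
non_null = 85259      # records with a non-null blood pressure (1.2)
after_filter = 85170  # records left after dropping the DBP outliers (1.3)

missing = total - non_null          # records without a blood pressure value
outliers = non_null - after_filter  # records dropped as DBP outliers
removed = total - after_filter      # total number of records removed
missing_rate = missing / float(total)

print(missing, outliers, removed)   # 1862 89 1951
print(round(missing_rate * 100, 2))
```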
8.25
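Antihypertensive drug use
2.1 Label antihypertensive and non-antihypertensive drugs: in the Excel file, 1 marks an antihypertensive drug and 0 marks a non-antihypertensive drug.
2.2 How many people take medication? (Across the four medication records, any record with a drug name counts as taking medication.)
2.3 How many people take antihypertensive drugs? (Across the four medication records, any record whose drug name is an antihypertensive drug counts as taking antihypertensive drugs.)
2.4 Definitions of medication pattern
Regular use of antihypertensive drugs: across the four medication records, any record names an antihypertensive drug and is marked "规律" (regular). How many people?
Intermittent use of antihypertensive drugs: across the four medication records, any record names an antihypertensive drug and is marked "间断" (intermittent), excluding anyone with any record of regular antihypertensive use. How many people?
Not taking antihypertensive drugs: across the four medication records, there is no antihypertensive drug name, or a record names an antihypertensive drug but is marked "不服用" (not taking), or antihypertensive drug names are entered but every one of them is marked "不服用". How many people?
2.5 Combination therapy (how many people in each group)
1 antihypertensive drug: across the four medication records, exactly 1 drug name is an antihypertensive drug.
2 antihypertensive drugs: across the four medication records, exactly 2 drug names are antihypertensive drugs.
More than 2 antihypertensive drugs: across the four medication records, more than 2 drug names are antihypertensive drugs.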
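The pattern definitions in 2.4 amount to a priority rule: any regular record wins, intermittent applies only when no regular record exists. The sketch below is one possible reading, not the original implementation. The function name `compliance_category`, the `'unknown'` bucket, the two-drug demo set, and the assumption that the checkbox is stored as the literal string `不服用` are all hypothetical.

```python
ANTIHYPERTENSIVES = {u'硝苯地平', u'卡托普利'}  # illustrative subset only


def compliance_category(row, drugs=ANTIHYPERTENSIVES):
    """Classify one person as 'regular', 'intermittent', 'none', or 'unknown'."""
    marks = set()
    for i in range(1, 5):
        # Collect the compliance mark of every record that names an antihypertensive drug.
        if row.get('take_medicine_%d' % i) in drugs:
            marks.add(row.get('take_medicine_%d_take_medicine_compliance' % i))
    if u'规律' in marks:
        return 'regular'        # any regular record takes priority
    if u'间断' in marks:
        return 'intermittent'   # intermittent only when no regular record exists
    if not marks or marks == {u'不服用'}:
        return 'none'           # no antihypertensive name, or every one marked "not taking"
    return 'unknown'            # antihypertensive name present, pattern not recorded


person = {
    'take_medicine_1': u'硝苯地平', 'take_medicine_1_take_medicine_compliance': u'间断',
    'take_medicine_2': u'卡托普利', 'take_medicine_2_take_medicine_compliance': u'规律',
}
print(compliance_category(person))
```

Because the function only uses `row.get`, it accepts a plain dict as well as a pandas row passed by `df.apply(..., axis=1)`.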
2.2
People with any medication record
75123
Code
df_35p_2[(df_35p_2['take_medicine_1'].isnull() == False) | (df_35p_2['take_medicine_2'].isnull() == False) | (df_35p_2['take_medicine_3'].isnull() == False) | (df_35p_2['take_medicine_4'].isnull()== False)].count()['id']
2.3
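People taking antihypertensive drugs
43086
Code
# Body of the row-wise function applied with df.apply(..., axis=1); the def line is not shown in the source.
if row['take_medicine_1'] in medicine_list or row['take_medicine_2'] in medicine_list or row['take_medicine_3'] in medicine_list or row['take_medicine_4'] in medicine_list:
    return True
else:
    return False
Here is_hy_m is the field that flags whether a recorded medication is an antihypertensive drug.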
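The membership test over the four medication columns can be written more compactly with `any`. This is a sketch under assumptions: the function name `takes_antihypertensive` and the short demo list are hypothetical, and the real `medicine_list` is not shown in the notes.

```python
MEDICINE_COLUMNS = ['take_medicine_%d' % i for i in range(1, 5)]


def takes_antihypertensive(row, medicine_list):
    # True when any of the four recorded drug names appears in medicine_list.
    return any(row[col] in medicine_list for col in MEDICINE_COLUMNS)


demo_list = [u'硝苯地平', u'卡托普利']  # illustrative subset only
user = {'take_medicine_1': u'阿司匹林', 'take_medicine_2': u'卡托普利',
        'take_medicine_3': None, 'take_medicine_4': None}
non_user = {'take_medicine_1': u'阿司匹林', 'take_medicine_2': None,
            'take_medicine_3': None, 'take_medicine_4': None}
print(takes_antihypertensive(user, demo_list))
print(takes_antihypertensive(non_user, demo_list))
```

With a DataFrame, the function could be applied as `df.apply(takes_antihypertensive, axis=1, args=(medicine_list,))`.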
2.4
Regular use of antihypertensive drugs
36200
Code
# Body of the row-wise function applied with df.apply(..., axis=1); the def line is not shown in the source.
if (row['take_medicine_1'] in medicine_list and row['take_medicine_1_take_medicine_compliance'] == u'规律') \
        or (row['take_medicine_2'] in medicine_list and row['take_medicine_2_take_medicine_compliance'] == u'规律') \
        or (row['take_medicine_3'] in medicine_list and row['take_medicine_3_take_medicine_compliance'] == u'规律') \
        or (row['take_medicine_4'] in medicine_list and row['take_medicine_4_take_medicine_compliance'] == u'规律'):
    return True
else:
    return False
2.4
Intermittent use of antihypertensive drugs
5580
Code
def is_hy_medicine_1(row):
    # True when any antihypertensive drug record is marked 间断 (intermittent)
    if (row['take_medicine_1'] in medicine_list and row['take_medicine_1_take_medicine_compliance'] == u'间断') \
            or (row['take_medicine_2'] in medicine_list and row['take_medicine_2_take_medicine_compliance'] == u'间断') \
            or (row['take_medicine_3'] in medicine_list and row['take_medicine_3_take_medicine_compliance'] == u'间断') \
            or (row['take_medicine_4'] in medicine_list and row['take_medicine_4_take_medicine_compliance'] == u'间断'):
        return True
    else:
        return False
# Keep the rows where is_hy_m is False, then flag intermittent use among them
df_bgl = df_35p_2[df_35p_2['is_hy_m'] == False].copy()
df_bgl['is_hy_m_1'] = df_bgl.apply(is_hy_medicine_1, axis=1)
Approach
First filter out those marked as regular
Then select those whose antihypertensive medication is marked as intermittent
2.4
Not taking antihypertensive drugs
42084
Code
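The code node for this group is empty in the notes. One reading that is consistent with the reported count is that the group is the complement of the antihypertensive users within the filtered sample. The check below only verifies that arithmetic; treating the group as a plain complement is an assumption.

```python
after_filter = 85170  # sample after filtering (1.3)
taking = 43086        # people taking antihypertensive drugs (2.3)
regular = 36200       # regular use (2.4)
intermittent = 5580   # intermittent use (2.4)

not_taking = after_filter - taking
# Users who are neither regular nor intermittent, i.e. pattern not determined.
pattern_other = taking - regular - intermittent

print(not_taking)     # 42084
print(pattern_other)  # 1306
```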
2.5
Code
# Counts how many of the four recorded drug names are in medicine_list; the enclosing def and return are not shown in the source.
count = 0
if row['take_medicine_1'] in medicine_list:
    count += 1
if row['take_medicine_2'] in medicine_list:
    count += 1
if row['take_medicine_3'] in medicine_list:
    count += 1
if row['take_medicine_4'] in medicine_list:
    count += 1
1
38044
df_35p_2[df_35p_2['is_hy_m_4'] == 1].count()['id']
2
4782
df_35p_2[df_35p_2['is_hy_m_4'] == 2].count()['id']
>2
260
df_35p_2[df_35p_2['is_hy_m_4'] >2].count()['id']
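The per-person drug count and its grouping into 1, 2, and more than 2 can be sketched as two small helpers. The names `count_antihypertensives` and `combination_group` and the demo list are hypothetical; the three totals in the final line are the ones reported above.

```python
def count_antihypertensives(row, medicine_list):
    # Number of the four recorded drug names that appear in medicine_list.
    return sum(1 for i in range(1, 5) if row['take_medicine_%d' % i] in medicine_list)


def combination_group(n):
    # Map a drug count to a group label: '0', '1', '2', or '>2'.
    return str(n) if n <= 2 else '>2'


demo_list = [u'硝苯地平', u'卡托普利', u'缬沙坦']  # illustrative subset only
row = {'take_medicine_1': u'硝苯地平', 'take_medicine_2': u'缬沙坦',
       'take_medicine_3': None, 'take_medicine_4': u'阿司匹林'}
n = count_antihypertensives(row, demo_list)
print(n, combination_group(n))

# The three reported groups add up to the 43086 antihypertensive users from 2.3.
group_total = 38044 + 4782 + 260
print(group_total)
```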
8.27
Total population
count
75123
par
df_35p_2
People taking antihypertensive drugs
count
43086
par
df_fy_h
code
# Body of the row-wise function applied with df.apply(..., axis=1); the def line is not shown in the source.
if row['take_medicine_1'] in medicine_list or row['take_medicine_2'] in medicine_list or row['take_medicine_3'] in medicine_list or row['take_medicine_4'] in medicine_list:
    return True
else:
    return False
People with any medication record
count
75123
par
df_fy
code
df_35p_2[(df_35p_2['take_medicine_1'].isnull() == False)
         | (df_35p_2['take_medicine_2'].isnull() == False)
         | (df_35p_2['take_medicine_3'].isnull() == False)
         | (df_35p_2['take_medicine_4'].isnull() == False)]
Medication pattern
Regular use of antihypertensive drugs
count
36200
par
df_fy_h_gl
code
# Body of the row-wise function applied with df.apply(..., axis=1); the def line is not shown in the source.
if (row['take_medicine_1'] in medicine_list and row['take_medicine_1_take_medicine_compliance'] == u'规律') \
        or (row['take_medicine_2'] in medicine_list and row['take_medicine_2_take_medicine_compliance'] == u'规律') \
        or (row['take_medicine_3'] in medicine_list and row['take_medicine_3_take_medicine_compliance'] == u'规律') \
        or (row['take_medicine_4'] in medicine_list and row['take_medicine_4_take_medicine_compliance'] == u'规律'):
    return True
else:
    return False
Intermittent use of antihypertensive drugs
count
5580
par
df_fy_h_jd
code
def is_hy_medicine_1(row):
    # True when any antihypertensive drug record is marked 间断 (intermittent)
    if (row['take_medicine_1'] in medicine_list and row['take_medicine_1_take_medicine_compliance'] == u'间断') \
            or (row['take_medicine_2'] in medicine_list and row['take_medicine_2_take_medicine_compliance'] == u'间断') \
            or (row['take_medicine_3'] in medicine_list and row['take_medicine_3_take_medicine_compliance'] == u'间断') \
            or (row['take_medicine_4'] in medicine_list and row['take_medicine_4_take_medicine_compliance'] == u'间断'):
        return True
    else:
        return False
# Keep the rows where is_hy_m is False, then flag intermittent use among them
df_bgl = df_35p_2[df_35p_2['is_hy_m'] == False].copy()
df_bgl['is_hy_m_1'] = df_bgl.apply(is_hy_medicine_1, axis=1)
Taking antihypertensive drugs, pattern unknown
count
1306
par
df_fy_h_other
code
def is_hy_medicine_normal(row):
    # True when any antihypertensive drug record has a compliance value filled in
    if (row['take_medicine_1'] in medicine_list and row['take_medicine_1_take_medicine_compliance'] is not None) \
            or (row['take_medicine_2'] in medicine_list and row['take_medicine_2_take_medicine_compliance'] is not None) \
            or (row['take_medicine_3'] in medicine_list and row['take_medicine_3_take_medicine_compliance'] is not None) \
            or (row['take_medicine_4'] in medicine_list and row['take_medicine_4_take_medicine_compliance'] is not None):
        return True
    else:
        return False
df_fy_h['is_hy_m_normal'] = df_fy_h.apply(is_hy_medicine_normal, axis=1)
df_fy_h_normal = df_fy_h[df_fy_h['is_hy_m_normal'] == True]
df_fy_h_other = df_fy_h[df_fy_h['is_hy_m_normal'] != True]
Not taking antihypertensive drugs
count
42084
par
df_bfy_h
code
Drug type
Traditional Chinese antihypertensive medicine
count
1568
par
df_h_z
code
medicine_x_list = [u'卡托普利', u'卡托普利片', u'硝苯地平', u'硝苯地平缓释片', u'硝苯地平片',
u'硝苯地平缓释片1', u'硝苯地平缓释片2', u'硝苯地平1', u'硝苯地平控释片', u'硝苯地平I',
u'硝苯地平缓释11', u'缓释硝苯地平片', u'硝苯地平缓释', u'利血平', u'复方利血平片', u'复方利血平',
u'复方利血平氨苯蝶啶片', u'复方复方利血平片', u'复方利血平氨苯蝶啶', u'复方利血平氮苯蹀啶片',
u'博苏', u'缬沙坦', u'缬沙坦胶囊', u'络活喜', u'寿比山', u'尼福达', u'欣康', u'长托普力',
u'氨氯地平片', u'苯磺酸氨氯地平片', u'马来酸左旋氨氯地平片', u'氢氯噻嗪', u'氯沙坦钾氢氯噻嗪',
u'酒石酸美托洛尔片', u'尼群地平片', u'替米沙坦', u'替米沙坦片', u'替米沙坦胶囊', u'洛丁兴',
u'托拉塞米', u'硝本地平', u'尼莫地平', u'尼莫地平片', u'欣盖达', u'依那普利', u'马来酸依那普利',
u'马来酸依那普利片', u'依那普利片', u'马来酸依那普利片', u'单硝酸异山梨酯', u'吲达胺怕',
u'苯磺酸左旋氨氯地平', u'苯磺酸氨氯地平', u'马拉松左旋氨氯地平', u'倍洛他克', u'吲达帕胺',
u'吲达帕胺片', u'吲达帕胺缓释片', u'厄贝沙坦', u'厄贝沙坦片', u'倍他乐克', u'伲福达', u'引达柏胺',
u'氨氯地平', u'富马酸比索洛尔片', u'倍他乐克片', u'美托洛尔', u'吲达帕安', u'倍他洛克',
u'复方降压灵', u'复方降压片', u'非洛地平', u'非洛地平片', u'尼群地平', u'复方降压胶囊', u'复方降压',
u'降压胶囊', u'降压O号', u'北京降压0号', u'硝本地片', u'倍尼地平',
u'引达帕胺片', u'0号降压片', u'降压零号', u'吲哒帕胺片', u'降压0号', u'硝笨地平']
def is_hy_medicine_z(row):
    # True when none of the four recorded names is in the Western drug list
    if (row['take_medicine_1'] not in medicine_x_list) \
            and (row['take_medicine_2'] not in medicine_x_list) \
            and (row['take_medicine_3'] not in medicine_x_list) \
            and (row['take_medicine_4'] not in medicine_x_list):
        return True
    else:
        return False
df_fy_h['is_hy_m_z'] = df_fy_h.apply(is_hy_medicine_z, axis=1)
df_h_z = df_fy_h[df_fy_h['is_hy_m_z'] == True]
Western antihypertensive medicine
count
41174
par
df_h_xi
code
medicine_z_list = [u'珍菊降压片', u'降压片', u'珍菊', u'珍菊降压', u'罗布麻', u'罗布麻片',
u'益肾稳压', u'珍菊降压片', u'罗布麻']
def is_hy_medicine_x(row):
    # True when none of the four recorded names is in the traditional Chinese drug list
    if (row['take_medicine_1'] not in medicine_z_list) \
            and (row['take_medicine_2'] not in medicine_z_list) \
            and (row['take_medicine_3'] not in medicine_z_list) \
            and (row['take_medicine_4'] not in medicine_z_list):
        return True
    else:
        return False
df_fy_h['is_hy_m_x'] = df_fy_h.apply(is_hy_medicine_x, axis=1)
df_h_xi = df_fy_h[df_fy_h['is_hy_m_x'] == True]
Combined traditional Chinese and Western antihypertensive medicine
count
344
par
df_h_zx
code
df_h_zx = df_fy_h[(df_fy_h['is_hy_m_x'] != True) & (df_fy_h['is_hy_m_z'] != True)]
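The three drug-type groups can also be derived in a single pass by testing membership in both lists directly. This is an alternative sketch, not the original method: the function name `drug_type` and the short demo lists are hypothetical, and it agrees with the flags above only when every antihypertensive name falls in exactly one of the two lists.

```python
def drug_type(row, western_list, tcm_list):
    names = [row['take_medicine_%d' % i] for i in range(1, 5)]
    has_western = any(n in western_list for n in names)
    has_tcm = any(n in tcm_list for n in names)
    if has_western and has_tcm:
        return 'combined'
    if has_tcm:
        return 'tcm'
    if has_western:
        return 'western'
    return 'none'  # no antihypertensive name in either list


demo_western = [u'硝苯地平', u'缬沙坦']   # illustrative subset only
demo_tcm = [u'珍菊降压片', u'罗布麻']     # illustrative subset only
row = {'take_medicine_1': u'硝苯地平', 'take_medicine_2': u'罗布麻',
       'take_medicine_3': None, 'take_medicine_4': None}
print(drug_type(row, demo_western, demo_tcm))

# The three reported groups add up to the 43086 antihypertensive users.
type_total = 1568 + 41174 + 344
print(type_total)
```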
Combination therapy
1 antihypertensive drug
count
38044
par
df_h_count_1
code
df_h_count_1 = df_35p_2[df_35p_2['is_hy_m_4'] == 1]
2 antihypertensive drugs
count
4782
par
df_h_count_2
code
df_h_count_2 = df_35p_2[df_35p_2['is_hy_m_4'] == 2]
More than 2 antihypertensive drugs
count
260
par
df_h_count_2_p
code
# Counts how many of the four recorded drug names are in medicine_list; the enclosing def and return are not shown in the source.
count = 0
if row['take_medicine_1'] in medicine_list:
    count += 1
if row['take_medicine_2'] in medicine_list:
    count += 1
if row['take_medicine_3'] in medicine_list:
    count += 1
if row['take_medicine_4'] in medicine_list:
    count += 1
df_h_count_2_p = df_35p_2[df_35p_2['is_hy_m_4'] >2]
Computing parameters
import pandas as pd

def cal_base(df_par, df_all=df_35p_2):
    # For each category: the count in df_all and the share (%) of df_par within it.
    excel_objs = []

    def add_total(col, label):
        # Rows of df_all with a non-null value in col, and df_par's size relative to them
        total = df_all[df_all[col].notnull()].count()['id']
        excel_objs.append(pd.DataFrame([total, round(df_par.count()['id'] / float(total), 4) * 100],
                                       index=[label, u'比例']))

    def add_group(label, mask):
        # mask is a function of a DataFrame, so the same condition is built separately for df_all and df_par
        count = df_all[mask(df_all)].count()['id']
        count_par = df_par[mask(df_par)].count()['id']
        excel_objs.append(pd.DataFrame([count, round(count_par / float(count), 4) * 100],
                                       index=[label, u'比例']))

    # Age
    excel_objs.append(pd.DataFrame([df_all[u'age'].mean(), df_all[u'age'].std()], index=[u'年龄', u'年龄标准差']))
    add_total(u'age', u'年龄 总数')
    for lo, hi in [(35, 44), (45, 54), (55, 64), (65, 74)]:
        add_group(u'%d-%d 数目' % (lo, hi), lambda d, lo=lo, hi=hi: (d[u'age'] >= lo) & (d[u'age'] <= hi))
    add_group(u'>=75 数目', lambda d: d[u'age'] >= 75)

    # Payment method
    add_total(u'payment_way', u'支付方式 总数')
    payment_ways = [u'全自费', u'新型农村合作医疗', u'城镇居民基本医疗保险', u'城镇职工基本医疗保险']
    for way in payment_ways:
        add_group(way + u' 数目', lambda d, way=way: d[u'payment_way'] == way)
    # Other: any payment method outside the four listed above (assumed intent)
    add_group(u'其他 数目', lambda d: ~d[u'payment_way'].isin(payment_ways))

    # Education level
    add_total(u'education', u'文化程度 总数')
    add_group(u'初中以下 数目', lambda d: d[u'education'].isin([u'文盲及半文盲', u'小学', u'初中']))
    add_group(u'高中/技校/中专 数目', lambda d: d[u'education'] == u'高中/技校/中专')
    add_group(u'大学专科及以上 数目', lambda d: d[u'education'] == u'大学专科及以上')
    add_group(u'不详 数目', lambda d: d[u'education'].isin([u'不详', u'Unknown']))

    # Blood type
    add_total(u'blood_type', u'血型 总数')
    for bt in [u'A型', u'B型', u'O型', u'AB型']:
        add_group(bt + u' 数目', lambda d, bt=bt: d[u'blood_type'] == bt)
    add_group(u'不详 数目', lambda d: d[u'blood_type'].isin([u'不详', u'Unknown']))

    # Blood pressure strata
    sbp, dbp = 'blood_pressure_sbp', 'blood_pressure_dbp'
    add_total(sbp, u'高血压分层 总数')
    add_group(u'理想血压 数目', lambda d: (d[sbp] < 120) & (d[dbp] < 80))
    add_group(u'正常 数目', lambda d: (d[sbp] >= 120) & (d[sbp] <= 139) & (d[dbp] >= 80) & (d[dbp] <= 89))
    add_group(u'一级血压 数目', lambda d: (d[sbp] >= 140) & (d[sbp] <= 159) & (d[dbp] >= 90) & (d[dbp] <= 99))
    add_group(u'二级血压及以上 数目', lambda d: (d[sbp] >= 160) & (d[dbp] >= 100))

    # BMI
    bmi = u'body_mass_index'
    add_total(bmi, u'BMI 总数')
    add_group(u'< 24 数目', lambda d: d[bmi] < 24)
    add_group(u'24-28 数目', lambda d: (d[bmi] >= 24) & (d[bmi] < 28))
    add_group(u'>= 28 数目', lambda d: d[bmi] >= 28)

    # Fasting blood glucose
    glu = u'blood_glucose_mmol'
    add_total(glu, u'空腹血糖 总数')
    add_group(u'<6.1 数目', lambda d: d[glu] < 6.1)
    add_group(u'6.1 - 7.0 数目', lambda d: (d[glu] >= 6.1) & (d[glu] < 7.0))
    add_group(u'≥7.0 数目', lambda d: d[glu] >= 7.0)
    return pd.concat(excel_objs)
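The share in `cal_base` is computed as `round(ratio, 4) * 100`; multiplying after rounding may leave floating-point residue in the exported table, and a category with zero members divides by zero. A possible variant, with the hypothetical helper name `share_percent`, multiplies first and guards the empty case:

```python
def share_percent(part, whole):
    """Percentage of `part` within `whole`, rounded to two decimals."""
    if whole == 0:
        return float('nan')  # empty category: no meaningful share
    # Scale to percent before rounding so the result keeps at most two decimals.
    return round(100.0 * part / whole, 2)


print(share_percent(1, 4))
print(share_percent(1, 3))
```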
8.28
8.29
Compute the blood pressure control rate for each age group at each BMI level
Age
35-44
<24
547
1009
24-28
627
1389
≥28
302
658
45-54
<24
2843
5572
24-28
3297
7705
≥28
1381
3371
55-64
<24
4841
9336
24-28
5471
12973
≥28
2236
5731
65-74
<24
3926
7910
24-28
4708
11078
≥28
1930
4866
≥75
<24
2203
4445
24-28
2666
6258
≥28
1097
2869
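A breakdown like the one above can be produced with `pd.cut` and `groupby`. The sketch below uses made-up toy records and is not the original code; it assumes the control definition from 1.5 and the age and BMI bands used in this outline, and the column names `controlled`, `age_group`, and `bmi_group` are hypothetical.

```python
import pandas as pd

# Toy records (made-up values) using the column names from the analysis.
df = pd.DataFrame({
    'age': [38, 41, 50, 58, 70, 80],
    'body_mass_index': [22.0, 25.5, 29.0, 23.0, 26.0, 30.0],
    'blood_pressure_sbp': [128, 150, 135, 142, 130, 138],
    'blood_pressure_dbp': [82, 95, 85, 88, 80, 92],
})

# Controlled means SBP < 140 and DBP < 90; stored as 0/1 so the mean is the control rate.
df['controlled'] = ((df['blood_pressure_sbp'] < 140) & (df['blood_pressure_dbp'] < 90)).astype(int)

# right=False makes each bin closed on the left, e.g. [35, 45) covers ages 35-44.
df['age_group'] = pd.cut(df['age'], bins=[35, 45, 55, 65, 75, 200], right=False,
                         labels=['35-44', '45-54', '55-64', '65-74', '>=75']).astype(str)
df['bmi_group'] = pd.cut(df['body_mass_index'], bins=[0, 24, 28, 1000], right=False,
                         labels=['<24', '24-28', '>=28']).astype(str)

# Per cell: number controlled, number of people, and the control rate.
rates = df.groupby(['age_group', 'bmi_group'])['controlled'].agg(['sum', 'count', 'mean'])
print(rates)
```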