news 2025/7/4 16:07:02

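1. Creating a DataFrame object

Overview
- A DataFrame is a tabular, structured data structure made up of one or more ordered columns (each a Series); different columns can hold different value types (numbers, strings, booleans, and so on).
- The DataFrame is the core data structure in Pandas, conventionally abbreviated df. You can think of a df as a two-dimensional table with rows, columns, and an index.
- Many Series attributes and methods work the same way on a DataFrame.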
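
The overview can be made concrete with a minimal sketch. The sample values below are assumptions chosen only for illustration; the point is that each column of a df is a Series with its own type, and that all columns share one row index.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
df = pd.DataFrame({
    'id': [1, 2, 3],
    'name': ['Zhang San', 'Li Si', 'Wang Wu'],
    'passed': [True, False, True],
})

# Selecting one column returns a Series that shares the df's row index.
name_col = df['name']
print(isinstance(name_col, pd.Series))
print(df.dtypes)   # each column keeps its own dtype
print(df.shape)    # (rows, columns)
```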

Ways to create one

Creating from a dict

import pandas as pd

dict_data = {
    'id': [1, 2, 3],
    'name': ['Zhang San', 'Li Si', 'Wang Wu'],
    'age': [18, 20, 22]
}
# Use the default auto-incrementing integer index.
# Each dict key becomes a column name of the df; each value supplies that column's data.
df1 = pd.DataFrame(data=dict_data)
print(df1)
print(type(df1))

# Use index to set the row labels and columns to set the column order.
df2 = pd.DataFrame(data=dict_data, index=['A', 'B', 'C'], columns=['id', 'age', 'name'])
print(df2)

Creating from a list of tuples

list_data = [(1, 'Zhang San', 18), (2, 'Li Si', 20), (3, 'Wang Wu', 22)]
df3 = pd.DataFrame(
    data=list_data,
    index=['A', 'B', 'C'],           # set the row labels manually
    columns=['id', 'name', 'age'])   # set the column names manually
print(df3)

# Output:
   id       name  age
A   1  Zhang San   18
B   2      Li Si   20
C   3    Wang Wu   22
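
As a quick check on the two creation routes, the sketch below builds the same table from a dict and from a list of tuples and compares the results. It also shows how columns behaves with a dict: it selects keys, so a name that is not in the dict ('city' here, a hypothetical column) produces a column of missing values.

```python
import pandas as pd

dict_data = {'id': [1, 2, 3], 'name': ['Zhang San', 'Li Si', 'Wang Wu'], 'age': [18, 20, 22]}
list_data = [(1, 'Zhang San', 18), (2, 'Li Si', 20), (3, 'Wang Wu', 22)]
labels = ['A', 'B', 'C']
cols = ['id', 'name', 'age']

from_dict = pd.DataFrame(data=dict_data, index=labels, columns=cols)
from_rows = pd.DataFrame(data=list_data, index=labels, columns=cols)
print(from_dict.equals(from_rows))   # both routes describe the same table

# With a dict, `columns` picks keys; 'city' is not a key, so that column is all NaN.
partial = pd.DataFrame(data=dict_data, columns=['id', 'city'])
print(partial)
```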

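2. Common Series attributes

Common attributes

Attribute	Description
loc	Select a subset by index label
iloc	Select a subset by integer position
dtype or dtypes	Data type of the Series contents
T	Transpose of the Series
shape	Dimensions of the data
size	Number of elements in the Series
values	Values of the Series
index	Index of the Series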

Code demo

# Load the data
import pandas as pd

# Read the CSV file and use the id column as the index
data = pd.read_csv('data/nobel_prizes.csv', index_col='id')
data.head()  # shows the first 5 rows by default

The loc attribute

first_row = data.loc[941]  # select by index label
print(first_row)           # the first row, displayed vertically as a Series
print(type(first_row))     # <class 'pandas.core.series.Series'>

The iloc attribute

first_row = data.iloc[0]   # select a subset by integer position
print(first_row)           # the first row, displayed vertically as a Series
print(type(first_row))     # <class 'pandas.core.series.Series'>
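
The difference between the two accessors is easier to see on a small frame. This is a minimal sketch: the rows below are made-up stand-ins for data/nobel_prizes.csv (only the id 941 comes from the example above), so it runs without the file.

```python
import pandas as pd

# Hypothetical rows standing in for data/nobel_prizes.csv.
data = pd.DataFrame(
    {'year': [2017, 2016, 2015], 'category': ['physics', 'chemistry', 'peace']},
    index=pd.Index([941, 880, 870], name='id'),
)

by_label = data.loc[941]     # the row whose index label is 941
by_position = data.iloc[0]   # the first row, whatever its label is
print(by_label.equals(by_position))

# iloc accepts negative positions; loc only understands labels.
last_row = data.iloc[-1]
print(last_row.name)         # the index label of the last row
```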

dtype or dtypes

print(first_row.dtype)          # element type of the Series; object here, because the row mixes strings and numbers
print(first_row['year'].dtype)  # type of the year element: int64

# The firstname element is a plain Python string, which has no dtype attribute, so this line raises AttributeError.
print(first_row['firstname'].dtype)
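
A small sketch of why the row reports object while a single column reports an integer type. The two rows are hypothetical; the column names follow the example above.

```python
import pandas as pd

# Hypothetical rows; 'year' and 'firstname' mirror the column names used above.
data = pd.DataFrame({'year': [2017, 2016], 'firstname': ['Ada', 'Alan']})

row = data.iloc[0]
print(row.dtype)            # object: one row has to hold an integer and a string together
print(data['year'].dtype)   # the column itself keeps an integer dtype
print(data.dtypes)          # dtypes lists the type of every column

# A plain Python str has no dtype attribute, so check before accessing it.
has_dtype = hasattr(row['firstname'], 'dtype')
print(has_dtype)
```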

The shape and size attributes

print(first_row.shape)      # dimensions
# Result: (7,), because the row has 7 elements

print(first_row.size)       # number of elements: 7

The values attribute

print(first_row.values)  # the element values of the Series

The index attribute

print(first_row.index)   # the index of the Series

print(first_row.keys())  # the keys() method of a Series returns the same thing
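
The same attributes can be tried without the CSV file. The sketch below uses a three-element Series with assumed values and labels.

```python
import pandas as pd

# Assumed values and labels, for illustration only.
s = pd.Series([37, 61, 90], index=['a', 'b', 'c'], name='Age')

print(s.shape)    # (3,)
print(s.size)     # 3
print(s.values)   # the underlying NumPy array
print(s.index)    # the index object
print(s.keys().equals(s.index))   # True: keys() returns the index
print(s.T.equals(s))              # True: transposing a one-dimensional Series changes nothing
```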

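3. Common Series methods

Common methods

Method	Description
append	Concatenate two or more Series (removed in pandas 2.0; use pd.concat)
corr	Compute the correlation with another Series
cov	Compute the covariance with another Series
describe	Compute common summary statistics
drop_duplicates	Return the Series with duplicates removed
equals	Test whether two Series are identical
get_values	Get the values of the Series, same as the values attribute (removed in pandas 1.0)
hist	Plot a histogram
isin	Test whether each element is among the given values
min	Return the minimum
max	Return the maximum
mean	Return the arithmetic mean
median	Return the median
mode	Return the mode(s)
quantile	Return the value at the given quantile
replace	Replace values in the Series with the given value
sample	Return a random sample of the Series
sort_values	Sort by value
to_frame	Convert the Series to a DataFrame
unique	Return the distinct values as an array
value_counts	Count occurrences of each distinct value
keys	Get the index
head	Show the first 5 values
tail	Show the last 5 values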
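
Two entries in the table above, append and get_values, are no longer available in current pandas releases (append was removed in 2.0, get_values in 1.0). A minimal sketch of the replacements, using made-up values:

```python
import pandas as pd

s1 = pd.Series([1, 2], index=['A', 'B'])
s2 = pd.Series([3, 4], index=['C', 'D'])

# pd.concat joins Series end to end, which is what Series.append used to do.
joined = pd.concat([s1, s2])
print(joined)

# to_numpy() returns the values as a NumPy array, like the old get_values().
arr = joined.to_numpy()
print(arr)
```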

Code demo

import pandas as pd

# Create a Series object
s1 = pd.Series(data=[1, 2, 3, 4, 2, 3], index=['A', 'B', 'C', 'D', 'E', 'F'])
# Number of values in the Series
print(len(s1))
# First 5 values (n defaults to 5)
print(s1.head())
print(s1.head(n=2))
# Last 5 values (n defaults to 5)
print(s1.tail())
print(s1.tail(n=2))
# Index of the Series
print(s1.keys())
# Convert the Series to a Python list
print(s1.tolist())
print(s1.to_list())
# Convert the Series to a DataFrame
print(s1.to_frame())
# Basic summary statistics of the Series
print(s1.describe())
# Maximum, minimum, mean, sum, ...
print(s1.max())
print(s1.min())
print(s1.mean())
print(s1.sum())
# Remove duplicate values; returns a Series
print(s1.drop_duplicates())
# Remove duplicate values; returns an array
print(s1.unique())
# Sort by value; ascending is the default
print(s1.sort_values(ascending=True))
# Sort by index; ascending=False sorts in descending order (the default is ascending)
print(s1.sort_index(ascending=False))
# Count occurrences of each distinct value, similar to a group-by count
print(s1.value_counts())
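
The demo above leaves out a few methods from the table. A short sketch of isin, mode, median, quantile, replace, and equals on the same s1 values:

```python
import pandas as pd

s1 = pd.Series(data=[1, 2, 3, 4, 2, 3], index=['A', 'B', 'C', 'D', 'E', 'F'])

print(s1.isin([2, 3]).tolist())     # [False, True, True, False, True, True]
print(s1.mode().tolist())           # [2, 3]: both values occur twice
print(s1.median())                  # 2.5
print(s1.quantile(0.5))             # 2.5: the median expressed as a quantile
print(s1.replace(4, 40).tolist())   # [1, 2, 3, 40, 2, 3]
print(s1.equals(s1.copy()))         # True
```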

Small example: movie data

# Load the movie data
movie = pd.read_csv('data/movie.csv')
movie.head()

# Get the director name column
director = movie.director_name      # director name
director = movie['director_name']   # director name, same result as above
director

# Get the column of the lead actor's Facebook likes
actor_1_fb_likes = movie.actor_1_facebook_likes  # lead actor's Facebook likes
actor_1_fb_likes.head()

# Statistics
director.value_counts()  # number of movies per director
director.count()         # number of non-null values (movies that have a director name): 4814
director.shape           # total number of rows, including nulls: (4916,)

# Details
actor_1_fb_likes.describe()  # summary of the lead actor's Facebook likes: count, mean, standard deviation, etc.
director.describe()          # string data, so only some statistics are shown
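
The gap between count() and shape in the example above comes from missing values. A minimal sketch with a hypothetical five-element column (the names are placeholders, None marks a missing director):

```python
import pandas as pd

# Hypothetical director names; None marks a missing value.
director = pd.Series(['A', 'B', None, 'A', None], name='director_name')

non_null = director.count()       # counts only non-null entries
total = director.size             # counts every entry, nulls included
missing = director.isna().sum()   # number of nulls
print(non_null, total, missing)

counts = director.value_counts()  # nulls are left out by default
print(counts)
print(director.value_counts(dropna=False))  # include the nulls as their own group
```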

4. Boolean indexing on a Series

From the scientists.csv dataset, list the values of the Age column that are greater than its mean. The steps are as follows:

Load and inspect the dataset

import pandas as pd

df = pd.read_csv('data/scientists.csv')
print(df)
# print(df.head())
# Output:
                   Name        Born        Died  Age          Occupation
0     Rosaline Franklin  1920-07-25  1958-04-16   37             Chemist
1        William Gosset  1876-06-13  1937-10-16   61        Statistician
2  Florence Nightingale  1820-05-12  1910-08-13   90               Nurse
3           Marie Curie  1867-11-07  1934-07-04   66             Chemist
4         Rachel Carson  1907-05-27  1964-04-14   56           Biologist
5             John Snow  1813-03-15  1858-06-16   45           Physician
6           Alan Turing  1912-06-23  1954-06-07   41  Computer Scientist
7          Johann Gauss  1777-04-30  1855-02-23   77       Mathematician

# How to select rows with a list of booleans (True keeps the row)
bool_values = [False, True, True, False, False, False, True, False]
df[bool_values]

# Output:
                   Name        Born        Died  Age          Occupation
1        William Gosset  1876-06-13  1937-10-16   61        Statistician
2  Florence Nightingale  1820-05-12  1910-08-13   90               Nurse
6           Alan Turing  1912-06-23  1954-06-07   41  Computer Scientist

Compute the mean of the Age column

# Get one column of data: df[column_name]
ages = df['Age']
print(ages)
print(type(ages))
print(ages.mean())

# Output:
0    37
1    61
2    90
3    66
4    56
5    45
6    41
7    77
Name: Age, dtype: int64
<class 'pandas.core.series.Series'>
59.125

Print the Age values greater than the mean

print(ages[ages > ages.mean()])

# Output:
1    61
2    90
3    66
7    77
Name: Age, dtype: int64

Summary

# The steps above can be done in a single line:
df[ages > ages.mean()]              # select the scientists whose age (lifespan) is above the average
df[df['Age'] > df.Age.mean()]       # the same thing without the intermediate variable
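
To see what the comparison produces before it is used as a filter, here is a self-contained sketch. The Name and Age values are copied from the scientists table printed earlier, so no CSV file is needed.

```python
import pandas as pd

# Name and Age columns copied from the scientists table printed earlier.
df = pd.DataFrame({
    'Name': ['Rosaline Franklin', 'William Gosset', 'Florence Nightingale', 'Marie Curie',
             'Rachel Carson', 'John Snow', 'Alan Turing', 'Johann Gauss'],
    'Age': [37, 61, 90, 66, 56, 45, 41, 77],
})

# The comparison yields a boolean Series aligned with df's index.
mask = df['Age'] > df['Age'].mean()
print(mask.tolist())

# Indexing with the mask keeps only the rows where it is True.
older = df[mask]
print(older['Name'].tolist())
```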

5. Series arithmetic

When a Series is combined with a numeric scalar, the scalar is applied to every element of the Series in turn.

When two Series are combined, elements with the same index label are computed together; labels present in only one of the Series yield NaN (missing value).

Arithmetic between a Series and a numeric scalar

# addition
print(ages + 10)
# multiplication
print(ages * 2)

# Output:
0     47
1     71
2    100
3     76
4     66
5     55
6     51
7     87
Name: Age, dtype: int64
0     74
1    122
2    180
3    132
4    112
5     90
6     82
7    154
Name: Age, dtype: int64

Arithmetic between two Series

print(ages + ages)
print('=' * 20)
print(pd.Series([1, 100]))
print('=' * 20)
print(ages + pd.Series([1, 100]))

# Output:
0     74
1    122
2    180
3    132
4    112
5     90
6     82
7    154
Name: Age, dtype: int64
====================
0      1
1    100
dtype: int64
====================
0     38.0
1    161.0
2      NaN
3      NaN
4      NaN
5      NaN
6      NaN
7      NaN
dtype: float64
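
If the NaN results are not wanted, one option is the add() method with fill_value, which substitutes a default for the label missing on one side. This is a sketch of an alternative to the + operator, not something the example above uses; the three ages are a shortened copy of the Age column.

```python
import pandas as pd

ages = pd.Series([37, 61, 90], name='Age')   # shortened copy of the Age column
other = pd.Series([1, 100])

plain = ages + other                     # label 2 exists only in ages, so the result is NaN
filled = ages.add(other, fill_value=0)   # treat the missing partner as 0 instead

print(plain)
print(filled)
```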

6. Common DataFrame attributes and methods

Basic demo

import pandas as pd

# Load the dataset into a df object
df = pd.read_csv('data/scientists.csv')

print('=============== Common attributes ===============')
# Dimensions, returned as a tuple (rows, columns); the tuple length is the number of dimensions
print(df.shape)
# Number of data values, rows * columns; NaN values are counted too
print(df.size)
# Data values, returned as a NumPy ndarray
print(df.values)
# Number of dimensions
print(df.ndim)
# Column names and their data types
print(df.dtypes)
# Row index, returned as an index object
print(df.index)
# Column names, returned as an index object
print(df.columns)
print('=============== Common methods ===============')
# First 5 rows
print(df.head())
# Last 5 rows
print(df.tail())
# Basic information about the df
df.info()
# Descriptive statistics for all numeric columns
print(df.describe())
# Descriptive statistics for all non-numeric columns
# exclude: leave out columns of the given types
print(df.describe(exclude=['int', 'float']))
# Descriptive statistics for all columns
# include: keep columns of the given types; 'all' means every type
print(df.describe(include='all'))
# Number of rows
print(len(df))
# Minimum of each column
print(df.min())
# Number of non-null values in each column
print(df.count())
# Mean of the numeric columns (numeric_only=True skips the string columns, which pandas 2.0+ would otherwise reject)
print(df.mean(numeric_only=True))
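
On a frame that mixes string and numeric columns, a plain mean() raises TypeError in pandas 2.0 and later. A minimal sketch of two ways to restrict the calculation to numeric columns, using two rows taken from the scientists table:

```python
import pandas as pd

# Two rows taken from the scientists table shown earlier.
df = pd.DataFrame({
    'Name': ['Rosaline Franklin', 'William Gosset'],
    'Age': [37, 61],
})

# Option 1: ask mean() to skip non-numeric columns.
numeric_means = df.mean(numeric_only=True)
print(numeric_means)

# Option 2: select the numeric columns first, then aggregate.
numeric_cols = df.select_dtypes(include='number')
print(numeric_cols.mean())
```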

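Boolean indexing on a DataFrame

# Same idea as the earlier example: movies whose lead actor has more Facebook likes than the average
movie[movie['actor_1_facebook_likes'] > movie['actor_1_facebook_likes'].mean()]

# A df also accepts a list of booleans, one per row
movie.head()[[True, True, False, True, False]]

DataFrame arithmetic

scientists = pd.read_csv('data/scientists.csv')   # the same data as df above

scientists * 2                  # apply the operation to every element
scientists + scientists         # operate on elements with matching index labels
scientists + scientists[:4]     # operate on matching index labels; rows without a match become NaN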
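
The alignment rule is easier to check on a purely numeric frame. In this sketch the 'Papers' column and its values are hypothetical; the ages come from the scientists table.

```python
import pandas as pd

# 'Papers' is a hypothetical column added so the frame has two numeric columns.
nums = pd.DataFrame({'Age': [37, 61, 90], 'Papers': [10, 20, 30]})

doubled = nums * 2           # the scalar is applied to every element
partial = nums + nums[:2]    # rows 0 and 1 match; row 2 has no partner

print(doubled)
print(partial)               # row 2 is NaN in every column
```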

7. DataFrame: index operations

Most Pandas APIs that modify a DataFrame or Series work on a copy by default and return the modified copy.

These APIs share a common parameter, inplace, whose default value is False.

If inplace is set to True, the original data is modified directly and the method returns None.
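
The two behaviours can be compared side by side. A minimal sketch with a two-row stand-in for the movie data (the gross values are made up):

```python
import pandas as pd

# Hypothetical stand-in for the movie data.
movie = pd.DataFrame({'movie_title': ['Avatar', 'Film B'], 'gross': [1, 2]})

# Default (inplace=False): a new df is returned and the original is untouched.
indexed = movie.set_index('movie_title')
still_a_column = 'movie_title' in movie.columns
print(still_a_column)

# inplace=True: the original is modified and the call returns None.
result = movie.set_index('movie_title', inplace=True)
print(result)
print(movie.index.name)
```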

Setting the row index with the set_index() function

# Read the file without specifying an index; Pandas adds an integer index starting from 0
movie = pd.read_csv('data/movie.csv')
movie.head()

# Use the movie title as the index column
movie1 = movie.set_index('movie_title')
movie1.head()

# With inplace=True, the original df object is modified
movie.set_index('movie_title', inplace=True)
movie.head()    # the original df is now indexed by movie_title

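The index column can also be specified directly when loading the data, through the index_col parameter of read_csv, for example pd.read_csv('data/movie.csv', index_col='movie_title').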
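
A runnable sketch of setting the index at load time. To avoid depending on data/movie.csv, it reads a tiny in-memory CSV whose contents are made up.

```python
import io
import pandas as pd

# Made-up CSV text standing in for data/movie.csv.
csv_text = "movie_title,gross\nFilm A,100\nFilm B,200\n"

# index_col names the column that should become the row index.
movie = pd.read_csv(io.StringIO(csv_text), index_col='movie_title')
print(movie.index.name)
print(movie.columns.tolist())
```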

Resetting the index with the reset_index() function

# With inplace=True, the source data is modified directly.
movie.reset_index(inplace=True)
movie.head()

8. DataFrame: renaming row and column labels

Method 1: the rename() function changes existing row index labels and column names

movie = pd.read_csv('data/movie.csv', index_col='movie_title')
movie.index[:5]    # first 5 row index labels

movie.columns[:5]  # first 5 column names

# Mappings from old labels to new ones (here, Chinese translations of the titles and column names)
idx_rename = {'Avatar': '阿凡达', "Pirates of the Caribbean: At World's End": '加勒比海盗'}
col_rename = {'color': '颜色', 'director_name': '导演名'}

# rename() returns a copy with the row index labels and column names changed
movie.rename(index=idx_rename, columns=col_rename).head()
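
Two details of rename() are worth checking on a small frame: it returns a new df rather than changing the original, and mapping keys that match no label are ignored by default. The frame and the new labels below are hypothetical.

```python
import pandas as pd

# Hypothetical two-row frame; only the labels matter here.
movie = pd.DataFrame(
    {'color': ['Color', 'Color'], 'director_name': ['Director X', 'Director Y']},
    index=['Avatar', 'Film B'],
)

renamed = movie.rename(
    index={'Avatar': 'Avatar (renamed)', 'No Such Film': 'ignored'},  # second key matches nothing
    columns={'color': 'colour'},
)
print(renamed.index.tolist())
print(renamed.columns.tolist())
print(movie.columns.tolist())   # the original df keeps its labels
```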

Method 2: extract the index and columns attributes, modify them, and assign them back

An Index object cannot be modified in place; convert it to a list, change the list elements, then replace the whole index.

movie = pd.read_csv('data/movie.csv', index_col='movie_title')

# Extract the row index labels and column names as lists
index_list = movie.index.tolist()
columns_list = movie.columns.tolist()

# Change the list elements
index_list[0] = '阿凡达'
index_list[1] = '加勒比海盗'

columns_list[0] = '颜色'
columns_list[1] = '导演名'

# Assign the modified lists back as the row index and column names
movie.index = index_list
movie.columns = columns_list

# Inspect the data
movie.head(5)
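
The list round-trip in Method 2 is needed because item assignment on an Index raises TypeError. A minimal sketch on a hypothetical one-row frame:

```python
import pandas as pd

# Hypothetical one-row frame; only the index label matters here.
movie = pd.DataFrame({'color': ['Color']}, index=['Avatar'])

try:
    movie.index[0] = 'Avatar (renamed)'   # item assignment on an Index is not supported
    raised = False
except TypeError as exc:
    raised = True
    print('TypeError:', exc)

# Replacing the whole index with a modified list is allowed.
labels = movie.index.tolist()
labels[0] = 'Avatar (renamed)'
movie.index = labels
print(movie.index.tolist())
```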

9. Adding, deleting, and inserting columns

Adding a column

movie = pd.read_csv('data/movie.csv')

# df[column_name] = value adds a new column to the df; by default it is appended as the last column.
movie['has_seen'] = 0    # new column: whether the movie has been seen

# New column: total Facebook likes of the director and the actors
movie['actor_director_facebook_likes'] = (
    movie['actor_1_facebook_likes'] +
    movie['actor_2_facebook_likes'] +
    movie['actor_3_facebook_likes'] +
    movie['director_facebook_likes']
)

movie.head()    # inspect the result

Deleting columns or rows

# movie.drop('has_seen')                    # raises KeyError: drop defaults to axis=0 (rows), and no row is labelled 'has_seen'
# movie.drop('has_seen', axis='columns')    # drop by column
# movie.drop('has_seen', axis=1)            # drop by column; 1 means columns

movie.head().drop([0, 1])                   # drop by row label: removes the rows labelled 0 and 1
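Inserting a column

insert() is a special case: it has no inplace parameter and always inserts into the original df object.

# insert() inserts a column. Parameters: loc = insert position (counting from 0), column = column name, value = values
# total profit = total gross - total budget
movie.insert(loc=1, column='profit', value=movie['gross'] - movie['budget'])
movie.head()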
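
A small sketch confirming that insert() changes the df it is called on and returns None, and showing what happens when the column already exists. The titles and amounts are made up.

```python
import pandas as pd

# Made-up titles and amounts.
movie = pd.DataFrame({'title': ['Film A', 'Film B'], 'gross': [300, 120], 'budget': [100, 150]})

result = movie.insert(loc=1, column='profit', value=movie['gross'] - movie['budget'])
print(result)                   # None: the df itself was modified
print(movie.columns.tolist())   # 'profit' now sits at position 1

try:
    movie.insert(loc=0, column='profit', value=0)   # the column already exists
    duplicate_rejected = False
except ValueError as exc:
    duplicate_rejected = True
    print('ValueError:', exc)
```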
