Determine the data types of a data frame's columns


lottogame 2020. 7. 2. 07:49


I'm using R and have loaded data into a data frame with read.csv(). How do I determine the data type of each column in the data frame?

Your best bet to start is to use ?str(). To explore some examples, let's make some data:

set.seed(3221)  # this makes the example exactly reproducible
my.data <- data.frame(y=rnorm(5), 
                      x1=c(1:5), 
                      x2=c(TRUE, TRUE, FALSE, FALSE, FALSE),
                      X3=letters[1:5],
                      stringsAsFactors=TRUE)  # since R 4.0.0 strings are no longer converted to factors by default

@Wilmer E Henao H's solution is very streamlined:

sapply(my.data, class)
        y        x1        x2        X3 
"numeric" "integer" "logical"  "factor"

Using str() gets you that information plus extra goodies (such as the levels of your factors and the first few values of each variable):

str(my.data)
'data.frame':  5 obs. of  4 variables:
$ y : num  1.03 1.599 -0.818 0.872 -2.682
$ x1: int  1 2 3 4 5
$ x2: logi  TRUE TRUE FALSE FALSE FALSE
$ X3: Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5

@Gavin Simpson's approach is also streamlined, but provides slightly different information than class():

sapply(my.data, typeof)
       y        x1        x2        X3 
"double" "integer" "logical" "integer"

For more information about class, typeof, and the middle child, mode, see this excellent SO thread: A comprehensive survey of the types of things in R; 'mode' and 'class' and 'typeof' are insufficient.
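
To make the difference between the three functions concrete, here is a minimal sketch that puts class, typeof and mode side by side. The data frame example_df and its column names are made up for illustration; stringsAsFactors = TRUE is set explicitly so the factor column appears regardless of R version.

```r
# Hypothetical example data; stringsAsFactors = TRUE forces a factor column
example_df <- data.frame(
  num = c(1.5, 2.5),
  int = 1:2,
  lgl = c(TRUE, FALSE),
  fct = c("a", "b"),
  stringsAsFactors = TRUE
)

# One row per column of example_df, one column per inspection function
type_summary <- data.frame(
  class  = sapply(example_df, class),
  typeof = sapply(example_df, typeof),
  mode   = sapply(example_df, mode)
)
print(type_summary)
```

The factor column is the interesting row: its class is "factor", but it is stored as integer codes, so typeof reports "integer" and mode reports "numeric".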

sapply(yourdataframe, class)

where yourdataframe is the name of the data frame you are using.

I would suggest

sapply(foo, typeof)

if you need the actual types of the vectors in the data frame. class() is a somewhat different beast.

If you don't need this information as a vector (i.e. you don't need to do something else with it programmatically later), just use str(foo).

In both cases foo would be replaced with the name of your data frame.
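
As a rough illustration of what "doing something programmatically" with the type vector might look like, the sketch below keeps only the columns stored as double or integer and averages them. The data frame foo and its columns are invented for the example; stringsAsFactors = FALSE is set so the text column is not stored as integer codes on older R versions.

```r
# Hypothetical data frame standing in for foo
foo <- data.frame(
  a = c(1.5, 2.5, 3.5),
  b = c("x", "y", "z"),
  c = 1:3,
  stringsAsFactors = FALSE
)

# Named character vector: one storage type per column
col_types <- sapply(foo, typeof)

# Keep the names of the columns stored as double or integer
numeric_names <- names(col_types)[col_types %in% c("double", "integer")]

# Column means of the numeric columns only
numeric_means <- colMeans(foo[numeric_names])
print(numeric_means)
```

Because typeof reports storage, a factor column would also come back as "integer"; if factors may be present, filtering with sapply(foo, is.numeric) is probably the safer choice.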

Simply pass your data frame into the following function:

data_types <- function(frame) {
  res <- lapply(frame, class)
  res_frame <- data.frame(unlist(res))
  barplot(table(res_frame), main="Data Types", col="steelblue", ylab="Number of Features")
}

to produce a plot of all the data types in your data frame. For the iris dataset we get the following:

data_types(iris)
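
One caveat worth sketching: class() can return more than one value for a column (date-time columns are the usual case), so counting unlist(lapply(frame, class)) can count such a column twice. The helper column_classes below is a hypothetical name, not part of any package; it collapses each column's classes into a single label before counting.

```r
# Hypothetical helper: exactly one label per column, even for multi-class columns
column_classes <- function(frame) {
  vapply(frame, function(col) paste(class(col), collapse = "/"), character(1))
}

# Made-up data with a date-time column, whose class is c("POSIXct", "POSIXt")
events <- data.frame(
  when = as.POSIXct(c("2020-07-01 12:00:00", "2020-07-02 07:49:00"), tz = "UTC"),
  n    = c(3L, 5L)
)

labels <- column_classes(events)
print(labels)
print(table(labels))  # each column is counted once
```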

For small data frames:

library(tidyverse)

as_tibble(mtcars)

prints the data frame together with its data types:

# A tibble: 32 x 11
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
 * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
 2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1

For large data frames:

glimpse(mtcars)

gives you a structured view of the data types:

Observations: 32
Variables: 11
$ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8, 16.4, 17....
$ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8, 8, 8, 8, ...
$ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 167.6, 167.6...
$ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180, 205, 215...
$ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92, 3.07, 3.0...
$ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.440, 3.440...
$ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18.30, 18.90...
$ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, ...
$ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, ...
$ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3, 3, 3, 3, ...
$ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2, 2, 4, 2, ...

To get a list of the columns' data types (as said by @Alexandre above):

map(mtcars, class)

gives a list of data types:

$mpg
[1] "numeric"

$cyl
[1] "numeric"

$disp
[1] "numeric"

$hp
[1] "numeric"

To change the data type of a column:

library(hablar)

mtcars %>% 
  convert(chr(mpg, am),
          int(carb))

converts the columns mpg and am to character and the column carb to integer:

# A tibble: 32 x 11
   mpg     cyl  disp    hp  drat    wt  qsec    vs am     gear  carb
   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <int>
 1 21        6  160    110  3.9   2.62  16.5     0 1         4     4
 2 21        6  160    110  3.9   2.88  17.0     0 1         4     4
 3 22.8      4  108     93  3.85  2.32  18.6     1 1         4     1
 4 21.4      6  258    110  3.08  3.22  19.4     1 0         3     1
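
If installing an extra package is not an option, roughly the same conversion can be sketched in base R with the as.*() functions. This is an assumed equivalent for illustration, not how hablar implements convert(); the copy name cars is made up.

```r
# Work on a copy so the built-in mtcars dataset stays untouched
cars <- mtcars

# Convert mpg and am to character, carb to integer
cars$mpg  <- as.character(cars$mpg)
cars$am   <- as.character(cars$am)
cars$carb <- as.integer(cars$carb)

# Check the result for the three converted columns
converted <- sapply(cars[c("mpg", "am", "carb")], class)
print(converted)
```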

Since it wasn't stated clearly, I'll just add this:

I was looking for a way to create a table that holds the number of occurrences of each data type.

Say we have a data.frame with two numeric columns and one logical column:

dta <- data.frame(a = c(1,2,3), 
                  b = c(4,5,6), 
                  c = c(TRUE, FALSE, TRUE))

You can summarize the number of columns of each data type with:

table(unlist(lapply(dta, class)))
# logical numeric 
#       1       2

This comes in extremely handy if you have a lot of columns and want to get a quick overview.

To give credit: This solution was inspired by the answer of @Cybernetic.

Here is a function that is part of the helpRFunctions package that will return a list of all of the various data types in your data frame, as well as the specific variable names associated with that type.

install.packages('devtools') # Only needed if you don't have this installed.
library(devtools)
install_github('adam-m-mcelhinney/helpRFunctions')
library(helpRFunctions)
my.data <- data.frame(y=rnorm(5), 
                  x1=c(1:5), 
                  x2=c(TRUE, TRUE, FALSE, FALSE, FALSE),
                  X3=letters[1:5],
                  stringsAsFactors=TRUE)  # needed on R >= 4.0 for X3 to be a factor
t <- list.df.var.types(my.data)
t$factor
t$integer
t$logical
t$numeric

You could then do something like var(my.data[t$numeric]).

Hope this is helpful!
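
For readers who cannot install a package from GitHub, a similar grouping of variable names by type can be sketched in base R with split(). This is an assumed stand-in for illustration, not the implementation of list.df.var.types, and vars_by_class is a made-up name.

```r
# Same example data as above; stringsAsFactors = TRUE keeps X3 a factor on R >= 4.0
my.data <- data.frame(y = rnorm(5),
                      x1 = 1:5,
                      x2 = c(TRUE, TRUE, FALSE, FALSE, FALSE),
                      X3 = letters[1:5],
                      stringsAsFactors = TRUE)

# Named list: one element per class, holding the matching column names
vars_by_class <- split(names(my.data), sapply(my.data, class))
print(vars_by_class)

# Use the grouping, e.g. the variance of every numeric column
numeric_vars <- sapply(my.data[vars_by_class$numeric], var)
print(numeric_vars)
```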

If you import the csv file as a data.frame (and not matrix), you can also use summary.default

summary.default(mtcars)

     Length Class  Mode   
mpg  32     -none- numeric
cyl  32     -none- numeric
disp 32     -none- numeric
hp   32     -none- numeric
drat 32     -none- numeric
wt   32     -none- numeric
qsec 32     -none- numeric
vs   32     -none- numeric
am   32     -none- numeric
gear 32     -none- numeric
carb 32     -none- numeric
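
Since the question starts from read.csv(), here is a small sketch of the whole round trip: read a CSV, then inspect the column types. The CSV content and column names are invented, and the text = argument is used only so the example runs without a file on disk.

```r
# Made-up CSV content, passed inline instead of a file path
csv_text <- "id,score,passed,name
1,9.5,TRUE,ann
2,7.25,FALSE,bob"

# stringsAsFactors = FALSE keeps the name column as character on every R version
scores <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# One class per column, as guessed by read.csv()
scores_classes <- sapply(scores, class)
print(scores_classes)
```

If the guessed types are not the ones you want, read.csv() also accepts a colClasses argument to set them at import time.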

Another option is using the map function of the purrr package.

library(purrr)
map(df, class)  # df stands for your own data frame

Reference URL: https://stackoverflow.com/questions/21125222/determine-the-data-types-of-a-data-frames-columns
