How do I read Excel-format dates in Python?

즐거운 cyworld 2021. 1. 10. 16:45

How can I convert an Excel date (stored in number format) to a proper date in Python?

You can use xlrd.

From its documentation, you can read that dates are always stored as numbers; however, you can use xldate_as_tuple to convert them to a Python date.

Note: the version on PyPI appears to be more recent than the one available on the xlrd website.

After testing and a few days' wait for feedback, I will svn-commit the following brand-new function to xlrd's xldate module. Note that it will not be available to the diehards still running Python 2.1 or 2.2.

# Convert an Excel number (presumed to represent a date, a datetime or a time) into
# a Python datetime.datetime
# @param xldate The Excel number
# @param datemode 0: 1900-based, 1: 1904-based.
# <br>WARNING: when using this function to
# interpret the contents of a workbook, you should pass in the Book.datemode
# attribute of that workbook. Whether
# the workbook has ever been anywhere near a Macintosh is irrelevant.
# @return a datetime.datetime object, to the nearest_second.
# <br>Special case: if 0.0 <= xldate < 1.0, it is assumed to represent a time;
# a datetime.time object will be returned.
# <br>Note: 1904-01-01 is not regarded as a valid date in the datemode 1 system; its "serial number"
# is zero.
# @throws XLDateNegative xldate < 0.00
# @throws XLDateAmbiguous The 1900 leap-year problem (datemode == 0 and 1.0 <= xldate < 61.0)
# @throws XLDateTooLarge Gregorian year 10000 or later
# @throws XLDateBadDatemode datemode arg is neither 0 nor 1
# @throws XLDateError Covers the 4 specific errors

def xldate_as_datetime(xldate, datemode):
    if datemode not in (0, 1):
        raise XLDateBadDatemode(datemode)
    if xldate == 0.00:
        return datetime.time(0, 0, 0)
    if xldate < 0.00:
        raise XLDateNegative(xldate)
    xldays = int(xldate)
    frac = xldate - xldays
    seconds = int(round(frac * 86400.0))
    assert 0 <= seconds <= 86400
    if seconds == 86400:
        seconds = 0
        xldays += 1
    if xldays >= _XLDAYS_TOO_LARGE[datemode]:
        raise XLDateTooLarge(xldate)

    if xldays == 0:
        # second = seconds % 60; minutes = seconds // 60
        minutes, second = divmod(seconds, 60)
        # minute = minutes % 60; hour    = minutes // 60
        hour, minute = divmod(minutes, 60)
        return datetime.time(hour, minute, second)

    if xldays < 61 and datemode == 0:
        raise XLDateAmbiguous(xldate)

    return (
        datetime.datetime.fromordinal(xldays + 693594 + 1462 * datemode)
        + datetime.timedelta(seconds=seconds)
    )
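
The XLDateAmbiguous case comes from Excel's 1900 date system counting a nonexistent 1900-02-29 as serial number 60. Below is a minimal sketch of the consequence, using only the standard library. The helper name serial_to_date_1900 is made up for illustration and is not part of xlrd.

```python
import datetime

# Hypothetical helper, not part of xlrd: maps a 1900-based Excel serial
# number to a date using the 1899-12-30 epoch, which only matches Excel
# for serial numbers of 61 and above.
EPOCH_1900 = datetime.date(1899, 12, 30)

def serial_to_date_1900(serial):
    return EPOCH_1900 + datetime.timedelta(days=serial)

# Serial 61 is 1900-03-01 in Excel, and the epoch arithmetic agrees.
print(serial_to_date_1900(61))   # 1900-03-01

# Excel shows serial 60 as 1900-02-29, a day that never existed, so the
# arithmetic lands on 1900-02-28 instead.
print(serial_to_date_1900(60))   # 1900-02-28

# Excel shows serial 1 as 1900-01-01, but the arithmetic gives 1899-12-31.
print(serial_to_date_1900(1))    # 1899-12-31
```

For serial numbers below 61 the two calendars disagree by a day, which is why the function above refuses to guess in datemode 0.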

Here is the bare-knuckle, no-seat-belts, use-at-your-own-risk version:

import datetime

def minimalist_xldate_as_datetime(xldate, datemode):
    # datemode: 0 for 1900-based, 1 for 1904-based
    return (
        datetime.datetime(1899, 12, 30)
        + datetime.timedelta(days=xldate + 1462 * datemode)
    )
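
A quick check of the minimalist version, with the function repeated so the snippet runs on its own. The sample serial numbers are illustrative only.

```python
import datetime

def minimalist_xldate_as_datetime(xldate, datemode):
    # datemode: 0 for 1900-based, 1 for 1904-based
    return (
        datetime.datetime(1899, 12, 30)
        + datetime.timedelta(days=xldate + 1462 * datemode)
    )

# 1900-based workbook: whole-day serial number
print(minimalist_xldate_as_datetime(41889, 0))    # 2014-09-07 00:00:00

# The fractional part is the time of day: 0.5 is noon
print(minimalist_xldate_as_datetime(41889.5, 0))  # 2014-09-07 12:00:00

# 1904-based workbook: serial 0 is 1904-01-01
print(minimalist_xldate_as_datetime(0, 1))        # 1904-01-01 00:00:00
```

Unlike the full function, this version does no range checking, so negative values and serial numbers below 61 in the 1900 system pass through silently.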

xlrd.xldate_as_tuple is nice, but there is also xlrd.xldate.xldate_as_datetime, which converts directly to a datetime.

import xlrd
wb = xlrd.open_workbook(filename)
xlrd.xldate.xldate_as_datetime(41889, wb.datemode)
=> datetime.datetime(2014, 9, 7, 0, 0)

See this link: Reading date as a string not float from excel using python xlrd

It worked for me.

Here is what the link says:

import datetime, xlrd
book = xlrd.open_workbook("myfile.xls")
sh = book.sheet_by_index(0)
a1 = sh.cell_value(rowx=0, colx=0)
a1_as_datetime = datetime.datetime(*xlrd.xldate_as_tuple(a1, book.datemode))
print('datetime: %s' % a1_as_datetime)

Expected situation

# Wrong output from cell_values()

# Expected output

Example: let cell_values(2,2) from sheet number 0 be the target date.

Get the required variables as follows:

import xlrd

# xlrd 2.0 and later read only .xls files; an .xlsx file needs xlrd < 2.0
workbook = xlrd.open_workbook("target.xlsx")
sheet = workbook.sheet_by_index(0)
wrongValue = sheet.cell_value(2,2)

Then use xldate_as_tuple:

y, m, d, h, i, s = xlrd.xldate_as_tuple(wrongValue, workbook.datemode)
print("{0} - {1} - {2}".format(y, m, d))

That's my solution.

If you are using pandas and read_excel has wrongly read a date in as an Excel number, and you need to recover the real date...

A lambda function applied to the column can use xlrd to recover the date:

import xlrd
df['possible_intdate'] = df['possible_intdate'].apply(lambda s: xlrd.xldate.xldate_as_datetime(s, 0))

>> df['possible_intdate']


Excel stores dates and times as a number representing the number of days since 1900-Jan-0. If you want to get the dates in date format using Python, just subtract 2 days from the days column, as shown below:

Date = sheet.cell(1,0).value-2  # in Python

Column 1 of my Excel sheet holds the dates, and the command above gives me the date values minus 2 days, which match the dates shown in the sheet.
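
A sketch of why subtracting 2 works, assuming the 1900 date system and a date after 1900-02-28. One day is lost because serial 1 (not 0) is 1900-01-01, and a second because Excel counts a nonexistent 1900-02-29. The sample value 41889 is illustrative.

```python
import datetime

excel_serial = 41889  # illustrative cell value, 1900 date system assumed

# Subtract 2, then count the result as days from 1900-01-01
days_since_1900 = excel_serial - 2
date = datetime.date(1900, 1, 1) + datetime.timedelta(days=days_since_1900)
print(date)  # 2014-09-07
```

This is the same arithmetic as adding the unadjusted serial number to 1899-12-30.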

For quick and dirty:

year, month, day, hour, minute, second = xlrd.xldate_as_tuple(excelDate, wb.datemode)
whatYouWant = str(month)+'/'+str(day)+'/'+str(year)

A combination of other people's posts gave me the date and time conversion for Excel. I return the result as a string.

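import datetime

def xldate_to_datetime(xldate):
  # Day zero of Excel's 1900 date system, valid for dates after 1900-02-28
  tempDate = datetime.datetime(1899, 12, 30)
  deltaDays = datetime.timedelta(days=int(xldate))
  # The fractional part of the serial number is the time of day
  secs = int(round((xldate % 1) * 86400))
  deltaSeconds = datetime.timedelta(seconds=secs)
  TheTime = (tempDate + deltaDays + deltaSeconds)
  return TheTime.strftime("%Y-%m-%d %H:%M:%S")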

Since your Excel files may come from different computers and people, the formatting may be messy, so be extra cautious.

I just imported data from 50-odd Excel files where the dates were entered as DD/MM/YYYY or DD-MM-YYYY, but most of the files stored them as MM/DD/YYYY (probably because the PCs were set up with en-us instead of en-gb or en-in).

Even more irritating, dates above 13/MM/YYYY were still in DD/MM/YYYY format, so there were variations within the Excel files.

The most reliable solution I found was to manually set the Date column in each Excel file to Plain Text, then use this code to parse it:

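from datetime import datetime

def parse_excel_date(date_str_from_excel):  # hypothetical wrapper name
    if date_str_from_excel:
        try:
            return datetime.strptime(date_str_from_excel, '%d/%m/%Y')
        except ValueError:
            print("Unable to parse date")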
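
Since the files described above used both DD/MM/YYYY and DD-MM-YYYY, one way to handle both is to try each format in turn. This is only a sketch; the function name parse_day_first and the format list are assumptions, not part of the original answer.

```python
from datetime import datetime

# Assumed formats, day first; extend the tuple if your files use others.
DAY_FIRST_FORMATS = ('%d/%m/%Y', '%d-%m-%Y')

def parse_day_first(date_str):
    """Return a datetime for a day-first date string, or None if no format fits."""
    for fmt in DAY_FIRST_FORMATS:
        try:
            return datetime.strptime(date_str.strip(), fmt)
        except ValueError:
            continue
    return None

print(parse_day_first('25/12/2020'))  # 2020-12-25 00:00:00
print(parse_day_first('25-12-2020'))  # 2020-12-25 00:00:00
print(parse_day_first('12/25/2020'))  # None, because there is no month 25
```

Returning None instead of printing lets the caller decide what to do with rows that fail. Note that a string such as 03/04/2020 parses without error under either day-first or month-first conventions, so this cannot detect a swapped day and month when the day is 12 or lower.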

This is a revised version of @hounded's code. It handles both the date and the time, for a value such as 43705.591795706.

    import math
    import datetime

    def xldate_to_datetime(xldatetime): #something like 43705.6158241088

      # Day zero of Excel's 1900 date system, valid for dates after 1900-02-28
      tempDate = datetime.datetime(1899, 12, 30)
      # math.modf returns the fractional part first, then the whole part
      (portion, days) = math.modf(xldatetime)

      deltaDays = datetime.timedelta(days=days)
      #changing the variable name in the edit
      secs = int(24 * 60 * 60 * portion)
      deltaSeconds = datetime.timedelta(seconds=secs)
      TheTime = (tempDate + deltaDays + deltaSeconds)
      return TheTime.strftime("%Y-%m-%d %H:%M:%S")

# 2019-08-28 14:46:47
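
One detail worth checking when splitting a serial number this way is that math.modf returns the fractional part first and the whole part second. A small sketch using the sample value from the comment in the function above:

```python
import math

# modf returns (fractional part, whole part), both as floats
fraction, whole_days = math.modf(43705.6158241088)
print(whole_days)  # 43705.0

# Convert the fraction of a day to whole seconds, truncating
seconds = int(24 * 60 * 60 * fraction)
print(seconds)     # 53207

# Break the seconds down into hours, minutes and seconds
minutes, second = divmod(seconds, 60)
hour, minute = divmod(minutes, 60)
print(hour, minute, second)  # 14 46 47
```

Truncating with int drops any partial second, whereas the longer xlrd function earlier in the thread rounds to the nearest second.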

When an Excel file is converted to CSV, a date/time cell looks like this:

foo, 3/16/2016 10:38, bar,

To convert the datetime text value into a Python datetime object, do this:

from datetime import datetime

date_object = datetime.strptime('3/16/2016 10:38', '%m/%d/%Y %H:%M')    # excel format (CSV file)

print(date_object) prints 2016-03-16 10:38:00
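
To pull the date out of a whole CSV row rather than a literal string, something along these lines might work. It is a sketch that assumes the date sits in the second column and that each comma is followed by a space, as in the sample line; adjust the column index and format string for real files.

```python
import csv
import io
from datetime import datetime

# The sample row from above, standing in for a real file
csv_text = "foo, 3/16/2016 10:38, bar,\n"

# skipinitialspace drops the blank that follows each comma
reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
for row in reader:
    date_object = datetime.strptime(row[1], '%m/%d/%Y %H:%M')
    print(date_object)  # 2016-03-16 10:38:00
```

With a real file, io.StringIO(csv_text) would be replaced by a file object opened with newline=''.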
