
How to efficiently remove keys with empty strings from a dict

lottogame 2020. 8. 15. 09:46

I have a dictionary and would like to remove every key whose value is an empty string.

metadata = {u'Composite:PreviewImage': u'(Binary data 101973 bytes)',
            u'EXIF:CFAPattern2': u''}

What is the best way to do this?


Python 2.X

dict((k, v) for k, v in metadata.iteritems() if v)

Python 2.7 - 3.X

{k: v for k, v in metadata.items() if v is not None}

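Every key has a value; it's just that some of those values are the empty string. There is no such thing as a key in a dict without a value; if it had no value, it would not be in the dict. Note that the v is not None test above drops only None values. To drop empty strings as the question asks, use the condition if v (drops every falsy value) or if v != '' (drops empty strings only).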
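To see how the choice of condition changes the result, here is a small sketch in Python 3 syntax. The sample dictionary is made up for illustration and does not come from the question.

```python
# Hypothetical sample data: a real value, an empty string, None, and zero.
sample = {"a": "text", "b": "", "c": None, "d": 0}

# Drops only None; the empty string survives.
not_none = {k: v for k, v in sample.items() if v is not None}

# Drops every falsy value, including 0.
truthy = {k: v for k, v in sample.items() if v}

# Drops only empty strings.
non_empty_str = {k: v for k, v in sample.items() if v != ""}

print(not_none)       # {'a': 'text', 'b': '', 'd': 0}
print(truthy)         # {'a': 'text'}
print(non_empty_str)  # {'a': 'text', 'c': None, 'd': 0}
```

Which condition is right depends on whether values such as 0, False, or None count as "empty" in your data.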


BrenBarn's solution can be made even shorter (and, I think, more readable):

{k: v for k, v in metadata.items() if v}

Tested with Python 2.7.3.


If you really need to modify the original dictionary:

empty_keys = [k for k,v in metadata.iteritems() if not v]
for k in empty_keys:
    del metadata[k]

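Note that we have to build a list of the empty keys first, because a dictionary cannot be modified while it is being iterated over (as you may have noticed). This is cheaper (memory-wise) than creating a brand-new dictionary, unless there are a lot of entries with empty values.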
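The same in-place idea can be sketched for Python 3, where iteritems() no longer exists. The function name drop_empty_in_place is hypothetical, and the keys are written as plain strings rather than u'' literals.

```python
def drop_empty_in_place(d):
    """Delete every key whose value is falsy, mutating d itself."""
    # Collect the keys first; deleting while iterating over d.items()
    # raises RuntimeError in Python 3.
    empty_keys = [k for k, v in d.items() if not v]
    for k in empty_keys:
        del d[k]
    return d  # returned only for convenience; it is the same object


metadata = {"Composite:PreviewImage": "(Binary data 101973 bytes)",
            "EXIF:CFAPattern2": ""}
same_object = drop_empty_in_place(metadata)
print(metadata)  # {'Composite:PreviewImage': '(Binary data 101973 bytes)'}
```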


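If you want a full-featured yet succinct approach to handling real-world data structures, which are often nested and can even contain cycles, I recommend looking at the remap utility from the boltons utility package.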

After running pip install boltons or copying iterutils.py into your project, just do:

from boltons.iterutils import remap

drop_falsey = lambda path, key, value: bool(value)
clean = remap(metadata, visit=drop_falsey)

This page has many more examples, including ones that work with much larger objects from Github's API.

It's pure Python, so it works everywhere, and it is fully tested in Python 2.7 and 3.3+. Best of all, I wrote it for exactly cases like this one, so if you find a case it doesn't handle, you can bug me to fix it right here.


BrenBarn's solution is ideal (and pythonic, I might add). Here is another (functional-style) solution, however:

from operator import itemgetter
dict(filter(itemgetter(1), metadata.items()))

Based on Ryan's solution, if you also have lists and nested dictionaries:

Python 2 :

def remove_empty_from_dict(d):
    if type(d) is dict:
        return dict((k, remove_empty_from_dict(v)) for k, v in d.iteritems() if v and remove_empty_from_dict(v))
    elif type(d) is list:
        return [remove_empty_from_dict(v) for v in d if v and remove_empty_from_dict(v)]
    else:
        return d

Python 3 :

def remove_empty_from_dict(d):
    if type(d) is dict:
        return dict((k, remove_empty_from_dict(v)) for k, v in d.items() if v and remove_empty_from_dict(v))
    elif type(d) is list:
        return [remove_empty_from_dict(v) for v in d if v and remove_empty_from_dict(v)]
    else:
        return d

If you have a nested dictionary and want this to work for empty sub-elements as well, you can use a recursive variant of BrenBarn's suggestion:

def scrub_dict(d):
    if type(d) is dict:
        return dict((k, scrub_dict(v)) for k, v in d.iteritems() if v and scrub_dict(v))
    else:
        return d
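The recursive versions above call themselves twice per value (once in the condition, once to build the result), which repeats work at every nesting level. A minimal Python 3 sketch that recurses once and reuses the result could look like this. The name scrub and the sample data are hypothetical, and isinstance is used instead of type(...) is, so dict and list subclasses are also handled.

```python
def scrub(obj):
    """Return a copy of obj with falsy values removed at every nesting level."""
    if isinstance(obj, dict):
        cleaned = {}
        for k, v in obj.items():
            v = scrub(v)  # recurse once and reuse the result
            if v:
                cleaned[k] = v
        return cleaned
    if isinstance(obj, list):
        cleaned_items = (scrub(v) for v in obj)
        return [v for v in cleaned_items if v]
    return obj


nested = {"a": "", "b": {"c": "", "d": {"e": None}},
          "f": [{"g": ""}, "x"], "h": "keep"}
print(scrub(nested))  # {'f': ['x'], 'h': 'keep'}
```

Like the originals, this treats every falsy value (including 0 and False) as empty.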

Quick answer (TL;DR)

Example01

### example01 -------------------

mydict  =   { "alpha":0,
              "bravo":"0",
              "charlie":"three",
              "delta":[],
              "echo":False,
              "foxy":"False",
              "golf":"",
              "hotel":"   ",                        
            }
newdict =   dict([(vkey, vdata) for vkey, vdata in mydict.iteritems() if(vdata) ])
print newdict

### result01 -------------------
result01 ='''
{'foxy': 'False', 'charlie': 'three', 'bravo': '0'}
'''

Detailed answer

Problem

  • Context: Python 2.x
  • Scenario: developer wishes to modify a dictionary to exclude blank values
    • aka remove empty values from a dictionary
    • aka delete keys with blank values
    • aka filter dictionary for non-blank values over each key-value pair

Solution

  • example01 uses Python list-comprehension syntax with a simple conditional to remove "empty" values

Pitfalls

  • example01 only operates on a copy of the original dictionary (does not modify in place)
  • example01 may produce unexpected results depending on what developer means by "empty"
    • Does developer mean to keep values that are falsy?
    • If the values in the dictionary are not guaranteed to be strings, developer may have unexpected data loss.
    • result01 shows that only three key-value pairs were preserved from the original set

Alternate example

  • example02 helps deal with potential pitfalls
  • The approach is to use a more precise definition of "empty" by changing the conditional.
  • Here we only want to filter out values that evaluate to blank strings.
  • Here we also use .strip() to filter out values that consist of only whitespace.

Example02

### example02 -------------------

mydict  =   { "alpha":0,
              "bravo":"0",
              "charlie":"three",
              "delta":[],
              "echo":False,
              "foxy":"False",
              "golf":"",
              "hotel":"   ",
            }
newdict =   dict([(vkey, vdata) for vkey, vdata in mydict.iteritems() if(str(vdata).strip()) ])
print newdict

### result02 -------------------
result02 ='''
{'alpha': 0,
  'bravo': '0', 
  'charlie': 'three', 
  'delta': [],
  'echo': False,
  'foxy': 'False'
  }
'''
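For Python 3, the same "blank strings only" rule can be sketched with an explicit type check instead of converting every value with str(). The helper name is_blank_string is hypothetical.

```python
def is_blank_string(value):
    """True only for strings that are empty or contain nothing but whitespace."""
    return isinstance(value, str) and not value.strip()


mydict = {"alpha": 0, "bravo": "0", "charlie": "three", "delta": [],
          "echo": False, "foxy": "False", "golf": "", "hotel": "   "}

# Non-string values (0, [], False) are never treated as blank.
newdict = {k: v for k, v in mydict.items() if not is_blank_string(v)}
print(newdict)
```

This keeps the same six entries as result02 and drops only "golf" and "hotel".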


Building on the answers from patriciasz and nneonneo, and accounting for the possibility that you might want to delete keys that have only certain falsy things (e.g. '') but not others (e.g. 0), or perhaps you even want to include some truthy things (e.g. 'SPAM'), then you could make a highly specific hitlist:

unwanted = ['', u'', None, False, [], 'SPAM']

Unfortunately, this doesn't quite work, because for example 0 in unwanted evaluates to True (in Python, 0 == False). We need to discriminate between 0 and other falsy things, so we have to use is:

any([0 is i for i in unwanted])

...evaluates to False.

Now use it to del the unwanted things:

unwanted_keys = [k for k, v in metadata.items() if any([v is i for i in unwanted])]
for k in unwanted_keys: del metadata[k]

If you want a new dictionary, instead of modifying metadata in place:

newdict = {k: v for k, v in metadata.items() if not any([v is i for i in unwanted])}
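One caveat with the is comparison: it tests object identity, so it depends on interpreter details such as string interning. Two separately created empty lists are never the same object, and Python 3.8+ emits a SyntaxWarning for is with a literal. A sketch of an alternative that compares type and equality instead is given below; the helper is_unwanted and the sample data are hypothetical.

```python
unwanted = ["", None, False, [], "SPAM"]


def is_unwanted(value, hitlist=unwanted):
    """Match on both type and equality, so 0 (int) does not match False (bool)."""
    return any(type(value) is type(item) and value == item for item in hitlist)


metadata = {"a": "", "b": 0, "c": [], "d": "SPAM",
            "e": None, "f": "eggs", "g": False}
newdict = {k: v for k, v in metadata.items() if not is_unwanted(v)}
print(newdict)  # {'b': 0, 'f': 'eggs'}
```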

For python 3

dict((k, v) for k, v in metadata.items() if v)

I read all replies in this thread and some referred also to this thread: Remove empty dicts in nested dictionary with recursive function

I originally used the solution here and it worked great:

Attempt 1: Too Hot (not performant or future-proof):

def scrub_dict(d):
    if type(d) is dict:
        return dict((k, scrub_dict(v)) for k, v in d.iteritems() if v and scrub_dict(v))
    else:
        return d

But some performance and compatibility concerns were raised in Python 2.7 world:

  1. use isinstance instead of type
  2. unroll the list comp into for loop for efficiency
  3. use python3 safe items instead of iteritems

Attempt 2: Too Cold (Lacks Memoization):

def scrub_dict(d):
    new_dict = {}
    for k, v in d.items():
        if isinstance(v,dict):
            v = scrub_dict(v)
        if not v in (u'', None, {}):
            new_dict[k] = v
    return new_dict

DOH! This is not recursive and not at all memoizant.

Attempt 3: Just Right (so far):

def scrub_dict(d):
    new_dict = {}
    for k, v in d.items():
        if isinstance(v,dict):
            v = scrub_dict(v)
        if not v in (u'', None, {}):
            new_dict[k] = v
    return new_dict

An alternative way to do this is with a dictionary comprehension. This should be compatible with Python 2.7+.

result = {
    key: value for key, value in
    {"foo": "bar", "lorem": None}.items()
    if value
}

Here is an option if you are using pandas:

import pandas as pd

d = dict.fromkeys(['a', 'b', 'c', 'd'])
d['b'] = 'not null'
d['c'] = ''  # empty string

print(d)

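# convert `dict` to `Series` and replace any blank strings with NaN
# (NaN rather than `None`, because older pandas versions treat
# `replace('', None)` as a forward fill); then use the `.dropna()`
# method and convert back to a `dict`
d_ = pd.Series(d).replace('', float('nan')).dropna().to_dict()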

print(d_)

Some of the methods mentioned above also drop integers and floats whose value is 0 or 0.0.

If you want to avoid that, you can use the code below (it removes empty strings and None values from nested dictionaries and nested lists):

def remove_empty_from_dict(d):
    if type(d) is dict:
        _temp = {}
        for k,v in d.items():
            if v == None or v == "":
                pass
            elif type(v) is int or type(v) is float:
                _temp[k] = remove_empty_from_dict(v)
            elif (v or remove_empty_from_dict(v)):
                _temp[k] = remove_empty_from_dict(v)
        return _temp
    elif type(d) is list:
        return [remove_empty_from_dict(v) for v in d if( (str(v).strip() or str(remove_empty_from_dict(v)).strip()) and (v != None or remove_empty_from_dict(v) != None))]
    else:
        return d

Some benchmarking:

1. Dict comprehension, recreating the dict

In [7]: %%timeit dic = {str(i):i for i in xrange(10)}; dic['10'] = None; dic['5'] = None
   ...: dic = {k: v for k, v in dic.items() if v is not None} 
   1000000 loops, best of 7: 375 ns per loop

2. Generator expression, recreating the dict using dict()

In [8]: %%timeit dic = {str(i):i for i in xrange(10)}; dic['10'] = None; dic['5'] = None
   ...: dic = dict((k, v) for k, v in dic.items() if v is not None)
1000000 loops, best of 7: 681 ns per loop

3. Loop and delete key if v is None

In [10]: %%timeit dic = {str(i):i for i in xrange(10)}; dic['10'] = None; dic['5'] = None
    ...: for k, v in dic.items():
    ...:   if v is None:
    ...:     del dic[k]
    ...: 
10000000 loops, best of 7: 160 ns per loop

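So loop-and-delete is the fastest at 160 ns, the dict comprehension takes roughly twice as long at ~375 ns, and the version with a call to dict() takes roughly twice as long again at ~680 ns. Note that variant 3 deletes while iterating over dic.items(), which only works in Python 2, where items() returns a list.

Wrapping 3 into a function brings it back to about 275 ns. Also, for me PyPy was about twice as fast as plain Python.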
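To repeat the comparison on Python 3, a rough sketch using the standard timeit module is shown below. The function names are hypothetical. Each call gets a freshly built dict, so the construction cost is included in every timing, and absolute numbers will differ from the IPython runs above.

```python
import timeit


def make_dict():
    """Build a fresh 11-entry dict with two None values, as in the runs above."""
    dic = {str(i): i for i in range(10)}
    dic["10"] = None
    dic["5"] = None
    return dic


def with_comprehension(dic):
    return {k: v for k, v in dic.items() if v is not None}


def with_dict_call(dic):
    return dict((k, v) for k, v in dic.items() if v is not None)


def with_loop_delete(dic):
    # list(...) takes a snapshot, so deleting during the loop is safe in Python 3.
    for k, v in list(dic.items()):
        if v is None:
            del dic[k]
    return dic


for func in (with_comprehension, with_dict_call, with_loop_delete):
    seconds = timeit.timeit(lambda: func(make_dict()), number=10000)
    print(func.__name__, seconds)
```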

Reference URL: https://stackoverflow.com/questions/12118695/efficient-way-to-remove-keys-with-empty-strings-from-a-dict
