Step_14_Data_Collection_Types

Python Basics

by DeepLearning Engineer 2021. 6. 7. 12:51

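Understanding the Lesson

The data types we have covered so far are integers, floats, Booleans, strings, and lists.
Integers and floats are simple types that hold a single number (and a Boolean holds a single truth value), so the only types that can hold several pieces of information are strings and lists.
A string is limited in how it can be processed: digits and letters can be mixed, but the whole thing is treated as a single chunk of text. That leaves the list as the only real collection type we have learned so far.
The list is a useful type, but Python provides several other types suited to different purposes.
In this lesson we learn about tuple, set, and dictionary, which hold multiple values like a list but each have their own characteristics and uses.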

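Data Collection Type Summary

The table below summarizes the data collection types that Python supports.
For each type it shows whether it is Mutable (items can be changed after creation), Ordered (items keep a defined position in a sequence), and whether it allows Duplication (the same value can appear more than once),
along with the notation used for the type and an example.
The last column shows the syntax for creating an empty collection, which you often need when you want a container to store results in.
You will get used to these through repeated use in your own programs; for now it is normal to find the concepts confusing, so refer back to the table whenever you need it.

Whether a collection is Ordered deserves particular care when you handle data in a program.
When you need explicit control over ordering, Python's standard modules can help.
If you are curious, read about the collections module at < https://docs.python.org/3/library/collections.html#collections.OrderedDict >.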

Collection	Mutable	Ordered	Duplication	Notation	Example	Empty Collection
string	no	yes	yes	' ' or " "	"simple string"	s = '' or s = ""
list	yes	yes	yes	[ ]	[ ["item0_0", 12 ], ["item1_0", 22 ] ]	l = []
tuple	no	yes	yes	( )	( ["item0_0", 12 ], ["item1_0", 22 ] )	t = ()
set	yes	no	no	{ }	{ 1, 2, 3, 4, 5 }	S = set()
dictionary	yes	insertion order (Python 3.7+)	keys: no, values: yes	{ }	{ 'one' : 1, 'two' : 2, 'three' : 3 }	d = {}
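
As a rough illustration of the columns in the table, the sketch below builds one small collection of each type and checks a few of the properties. The variable names and sample values are made up for this illustration only.

```python
# Hypothetical sample values, chosen only to illustrate the table above.
text = "apple"
items = [3, 1, 3]
pair = (3, 1, 3)
unique = {3, 1, 3}
ages = {"one": 1, "two": 2}

# Ordered types support positional indexing.
print(text[0], items[0], pair[0])            # a 3 3

# A set silently drops duplicates; lists and tuples keep them.
print(len(items), len(pair), len(unique))    # 3 3 2

# Mutable types can be changed in place.
items.append(7)
unique.add(7)
ages["three"] = 3

# Since Python 3.7 a dict keeps its keys in insertion order.
print(list(ages))                            # ['one', 'two', 'three']

# {} creates an empty dict, so an empty set must be written set().
print(type({}), type(set()))                 # <class 'dict'> <class 'set'>
```

Note that strings and tuples offer no counterpart to append() or add(): changing them always means building a new object.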

Understanding the `set` Type

A set is an unordered data type that works like a set in mathematics, so it cannot hold duplicate values.
In the example below, exSet1 follows the rules of a set, whereas exSet2 contains a duplicate item ('p').
When you run the code, you can see that Python automatically drops one of the duplicates from exSet2.

exSet1 = { 1, 2, 3, 4, 5 }
print(exSet1)

exSet2 = { 'a', 'p', 'p', 'l', 'e' }
print(exSet2)

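Running the example above produces the following output. Because a set is unordered, the items of exSet2 may print in a different order on your machine.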

{1, 2, 3, 4, 5}
{'p', 'a', 'e', 'l'}
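
A common use of this behavior is removing duplicates from a list. The sketch below assumes a hypothetical list named letters; because a set has no defined order, sorted() is used to get a predictable order back.

```python
# Hypothetical input list with a repeated item.
letters = ['a', 'p', 'p', 'l', 'e']

# Converting to a set drops the duplicate 'p'.
unique_letters = set(letters)
print(len(letters), len(unique_letters))             # 5 4

# A set has no defined order, so sort it when a stable order is needed.
print(sorted(unique_letters))                        # ['a', 'e', 'l', 'p']

# Membership tests use the in operator.
print('p' in unique_letters, 'z' in unique_letters)  # True False
```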

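A set can also be created by calling set(). The set() function accepts at most one argument, which must be an iterable, for example:

(a) Nothing: calling set() with no argument creates an empty set
(b) A list: creates a set whose items are the elements of the list; duplicate elements are included only once
(c) A set: creates a new set containing the same items as the given set
(d) range(n): creates a set of the integers from 0 to n-1 returned by range(), as covered in the lesson on loops
(e) A tuple: as with a list, creates a set whose items are the elements of the tuple; duplicate elements are included only once

The following example program shows each of these cases in a few lines of code.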

aSet = set()
print("(a)", aSet)

bSet = set([1,2,3,4,5])
print("(b)", bSet)

cSet = set({1,3,5,7,9})
print("(c)", cSet)

dSet = set(range(5))
print("(d)", dSet)

eSet = set((2,4,6,8,10))
print("(e)", eSet)

Running the example above produces the following output.

(a) set()
(b) {1, 2, 3, 4, 5}
(c) {1, 3, 5, 7, 9}
(d) {0, 1, 2, 3, 4}
(e) {2, 4, 6, 8, 10}
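
The cases (a) to (e) are not the only possibilities: any iterable works, including a string. The sketch below, which uses the made-up names fSet and rejected, also shows what happens when set() is given more than one argument or a value that is not iterable.

```python
# A string is iterable too: each character becomes an item.
fSet = set("apple")
print(sorted(fSet))          # ['a', 'e', 'l', 'p']

# Collect the calls that Python rejects (names here are illustrative).
rejected = []

# More than one argument raises TypeError.
try:
    set(1, 2, 3)
except TypeError:
    rejected.append("three arguments")

# A single argument that is not iterable, such as an int, also raises TypeError.
try:
    set(5)
except TypeError:
    rejected.append("non-iterable argument")

print(rejected)              # ['three arguments', 'non-iterable argument']
```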

set is one of Python's built-in classes and provides a variety of methods.
The methods of the set type are documented on the Python site at < https://docs.python.org/3.7/library/stdtypes.html?highlight=set#set >.

The most basic operations are adding an item to a set and removing one.
The methods for these are add() and remove(); each takes the item to add or remove as its argument.
Next come the operations most closely associated with sets, summarized in the table below.

Operation	Math notation	Python syntax
Union	$A \cup B$	A.union(B) or A | B
Intersection	$A \cap B$	A.intersection(B) or A & B
Difference	$A - B$	A.difference(B) or A - B
Superset	$A \supseteq B$	A.issuperset(B) or A >= B
Subset	$A \subseteq B$	A.issubset(B) or A <= B

The following program shows these operations in use, and its output follows the listing.
Also visit the site above to look up the other methods of the set type and try them out yourself.

aSet = {1,2,3,4,5}
bSet = {1,2,4,8,16}

aSet.add(6)
print(aSet)

aSet.remove(6)
print(aSet)

print(aSet | bSet)
print(aSet & bSet)
print(aSet - bSet)

print(aSet.union(bSet))
print(aSet.intersection(bSet))
print(aSet.difference(bSet))

print(aSet >= bSet)
print(aSet <= bSet)
print(aSet >= (aSet - bSet))

{1, 2, 3, 4, 5, 6}
{1, 2, 3, 4, 5}
{1, 2, 3, 4, 5, 8, 16}
{1, 2, 4}
{3, 5}
{1, 2, 3, 4, 5, 8, 16}
{1, 2, 4}
{3, 5}
False
False
True
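
A few more set operations are worth trying alongside the ones above. The sketch below reuses aSet and bSet and shows discard(), the symmetric difference, and the proper-subset comparison; these choices are an illustrative selection, not a complete tour of the set API.

```python
aSet = {1, 2, 3, 4, 5}
bSet = {1, 2, 4, 8, 16}

# remove() raises KeyError when the item is missing...
try:
    aSet.remove(99)
except KeyError as err:
    print("KeyError:", err)                       # KeyError: 99

# ...while discard() does nothing in that case.
aSet.discard(99)
print(sorted(aSet))                               # [1, 2, 3, 4, 5]

# Symmetric difference: items found in exactly one of the two sets.
print(sorted(aSet ^ bSet))                        # [3, 5, 8, 16]
print(sorted(aSet.symmetric_difference(bSet)))    # [3, 5, 8, 16]

# < tests for a proper subset, so a set is never < itself.
print({1, 2} < aSet, aSet < aSet, aSet <= aSet)   # True False True
```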

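Understanding the `tuple` Type

A tuple is very similar to a list in many respects; the biggest difference is that the items of a tuple cannot be changed (a tuple is immutable).
Let's start by creating a tuple with the ( ) notation and work through the details.

The following program creates tempTuple, a tuple holding five numbers,
then prints its contents and uses the type() function to confirm that tempTuple is a tuple.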

tempTuple = (1,2,3,4,5)

print(tempTuple)
type(tempTuple)

Running the example above produces the following result.

(1, 2, 3, 4, 5)

Out[4]:

tuple

To read the first item, the number 1, use tempTuple[0], just as with a list.
Slicing syntax such as tempTuple[0:3], which we learned for lists, works on tuples in the same way.
However, because the items cannot be changed, an assignment such as tempTuple[0] = 0 raises an error, as the following code shows.

tempTuple = (1,2,3,4,5)

print(tempTuple[0])
print(tempTuple[0:3])

tempTuple[0] = 0 # Error in this statement

Running the example above produces the following output, ending in a TypeError.

1
(1, 2, 3)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-466a14261a92> in <module>
      4 print(tempTuple[0:3])
      5 
----> 6 tempTuple[0] = 0 # Error in this statement

TypeError: 'tuple' object does not support item assignment
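
When a "changed" tuple is needed, the usual workaround is to build a new one and leave the original alone. The sketch below shows two ways of doing that; newTuple and asList are names invented for this illustration.

```python
tempTuple = (1, 2, 3, 4, 5)

# Build a new tuple from a replacement item plus a slice of the old one.
newTuple = (0,) + tempTuple[1:]
print(newTuple)         # (0, 2, 3, 4, 5)
print(tempTuple)        # (1, 2, 3, 4, 5)  (the original is unchanged)

# Alternatively, convert to a list, modify it, and convert back.
asList = list(tempTuple)
asList[0] = 0
print(tuple(asList))    # (0, 2, 3, 4, 5)
```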

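One thing to watch out for: because tuples use ordinary parentheses ( ), you need to take care when creating a tuple with a single item.

In the following example, minList is an ordinary list, and its print() statement shows <class 'list'>.
But when we use ( ), the notation for tuples, to create notTuple with the single item 8 and check its type, we get <class 'int'>.
What went wrong? Parentheses are normally used to group parts of an expression and control the order of evaluation.
So notTuple = (8) does nothing more than wrap a single integer in parentheses.

How, then, do we create a tuple with exactly one item?
Write the item followed by a comma, as in minTuple = (8,). It looks a little odd, but the trailing comma tells Python that this is a tuple.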

minList = [8]
print("[minList]", minList, type(minList))

notTuple = (8)
print("[notTuple]", notTuple, type(notTuple))

minTuple = (8,)
print("[minTuple]", minTuple, type(minTuple))

Running the example above produces the following output.

[minList] [8] <class 'list'>
[notTuple] 8 <class 'int'>
[minTuple] (8,) <class 'tuple'>
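
It may help to see that the comma, not the parentheses, is what creates a tuple. The sketch below uses made-up names (bareTuple, emptyTuple, fromList) to show a few related cases.

```python
# Parentheses are optional here; the comma creates the tuple.
bareTuple = 8,
print(bareTuple, type(bareTuple))     # (8,) <class 'tuple'>

# The empty tuple is the one case written with parentheses alone.
emptyTuple = ()
print(emptyTuple, len(emptyTuple))    # () 0

# tuple() applied to a one-item list also gives a one-item tuple.
fromList = tuple([8])
print(fromList == bareTuple)          # True

# Commas on the left-hand side unpack a tuple into separate names.
x, y = (3, 4)
print(x, y)                           # 3 4
```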

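Finally, there is one point that is easy to get confused about.
The items of a tuple cannot be replaced, but if an item is itself a mutable data collection such as a list, the values inside that item can be changed.
Let's confirm this with the following example.

In the example below, sampleList (a tuple, despite its name) holds two lists. The statement sampleList[0] = ["Python", 'B'] tries to replace an item of the tuple directly, so it raises an error.
In contrast, sampleList[0][1] = 'A' modifies data inside the item rather than the item itself, so it is allowed.
In other words, when an item of a tuple is a mutable data collection, the item itself cannot be swapped out, but the data it holds can be modified.
As in the example, if an item of the tuple is a list X, you cannot replace list X with a list Y, but you can change the contents of list X.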

sampleList = (["Python", 'B'], ["C++", 'B'])
print("[Before]", type(sampleList), sampleList)

sampleList[0][1] = 'A'
print("[trial.1]", type(sampleList), sampleList)

sampleList[0] = ["Python", 'B'] # Error in this statement
print("[trial.2]", type(sampleList), sampleList)

print("[After]", type(sampleList), sampleList)

Running the example above produces the following output, ending in a TypeError.

[Before] <class 'tuple'> (['Python', 'B'], ['C++', 'B'])
[trial.1] <class 'tuple'> (['Python', 'A'], ['C++', 'B'])

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-94cdf67eafe3> in <module>
      5 print("[trial.1]", type(sampleList), sampleList)
      6 
----> 7 sampleList[0] = ["Python", 'B'] # Error in this statement
      8 print("[trial.2]", type(sampleList), sampleList)
      9 

TypeError: 'tuple' object does not support item assignment
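
One way to picture this is that a tuple stores references to its items and only those references are frozen. The sketch below checks this with id() and also shows a side effect: a tuple that contains a list cannot be hashed, so it cannot be stored in a set or used as a dictionary key. The helper name is_hashable is invented for this illustration.

```python
sampleList = (["Python", 'B'], ["C++", 'B'])

# The first item is the same list object before and after its contents change.
before = id(sampleList[0])
sampleList[0][1] = 'A'
print(before == id(sampleList[0]))     # True

def is_hashable(obj):
    """Return True if obj can be a set item or a dictionary key."""
    try:
        hash(obj)
        return True
    except TypeError:
        return False

# A tuple holding lists is not hashable; a tuple of strings is.
print(is_hashable(sampleList))         # False
print(is_hashable(("Python", 'A')))    # True
```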

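Understanding the `dictionary` Type

As its name suggests, the dictionary type stores "word: meaning (value)" pairs, like a dictionary.
Because the "word" serves as the lookup key, duplicate words are not allowed, although the same "meaning (value)" may appear more than once.

To look at the operations commonly used with dictionaries, we create langAuthor, a dictionary that maps programming languages to their authors,
and check its type() and contents as follows.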

langAuthor = {"python":"Guido van Rossum","C++":"Bjarne Stroustrup"}
print(type(langAuthor), langAuthor)

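We said a dictionary works like the dictionaries we use every day, so let's look something up.
The authors of 'python' and 'C++' can be retrieved as follows.
Lists and tuples also use the [ ] notation, but there we put a number, or a variable holding a number, inside it.
With a dictionary we write the word (the key) inside [ ], and the result is the meaning (value) that corresponds to that key.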

print(langAuthor['python'])
print(langAuthor['C++'])

A key in a dictionary cannot be changed in place, but the value that corresponds to a key can be.
In the following code we convert the name of the author of "C++" to uppercase and store it again.

langAuthor['C++'] = "Bjarne Stroustrup".upper()
print(langAuthor['C++'])

To delete an item from a dictionary, use Python's built-in del statement.
Give del the dictionary entry for a key, as shown below, and that entry is removed.

del langAuthor['C++']
print(type(langAuthor), langAuthor)

Adding an item to a dictionary is easy.
Write the new key inside [ ] and assign the corresponding value to it, as follows.

langAuthor['C++'] = "Bjarne Stroustrup"
print(type(langAuthor), langAuthor)

Like a list or a tuple, a dictionary can be used in a loop.
The code below uses a loop to read the keys of the dictionary one by one and print each key with its value.

for item in langAuthor:
    print(item, "is designed by ", langAuthor[item])

The program below combines all of the code explained so far.

langAuthor = {"python":"Guido van Rossum","C++":"Bjarne Stroustrup"}
print(type(langAuthor), langAuthor)

print(langAuthor['python'])
print(langAuthor['C++'])

langAuthor['C++'] = "Bjarne Stroustrup".upper()
print(langAuthor['C++'])

del langAuthor['C++']
print(type(langAuthor), langAuthor)

langAuthor['C++'] = "Bjarne Stroustrup"
print(type(langAuthor), langAuthor)

for item in langAuthor:
    print(item, "is designed by ", langAuthor[item])

Running the example above produces the following output.

<class 'dict'> {'python': 'Guido van Rossum', 'C++': 'Bjarne Stroustrup'}
Guido van Rossum
Bjarne Stroustrup
BJARNE STROUSTRUP
<class 'dict'> {'python': 'Guido van Rossum'}
<class 'dict'> {'python': 'Guido van Rossum', 'C++': 'Bjarne Stroustrup'}
python is designed by  Guido van Rossum
C++ is designed by  Bjarne Stroustrup
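
The [ ] lookup used above raises an error when the key is missing. The sketch below shows a few standard dict methods that handle this and related tasks; the key "go" and the fallback text "unknown" are made up for the illustration.

```python
langAuthor = {"python": "Guido van Rossum", "C++": "Bjarne Stroustrup"}

# Looking up a missing key with [ ] raises KeyError.
try:
    langAuthor["go"]
except KeyError as err:
    print("KeyError:", err)               # KeyError: 'go'

# The in operator and get() avoid the exception.
print("go" in langAuthor)                 # False
print(langAuthor.get("go", "unknown"))    # unknown

# items() yields (key, value) pairs, so the loop needs no second lookup.
for lang, author in langAuthor.items():
    print(lang, "is designed by", author)

# pop() removes a key and returns its value in one step.
removed = langAuthor.pop("C++")
print(removed, len(langAuthor))           # Bjarne Stroustrup 1
```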

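Exercises (LAB)

Developing a `list` >> `tuple` >> `set` Conversion Program

Develop a program that meets the following requirements and run it in the input cell below.

(a) Declare two lists as follows: one holding programming languages and the other holding the names of their developers.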

language = ["python", "c++", "javascript", "go"]
author = ["Guido van Rossum", "Bjarne Stroustrup", "Brendan Eich", "Robert Griesemer"]

(b) Write a function matingPairs() that takes the two lists above as parameters and returns a set.
(c) matingPairs() takes one value at a time from each list and builds a tuple of the language name and its author.
(d) It adds each tuple built in (c) as an item of a set held in a local variable inside matingPairs().
(e) Once every language has been paired with its author and added to the set, matingPairs() returns the set as its result.
(f) Print the result of matingPairs() on the screen.

In [9]:

language = ["python", "c++", "javascript", "go"]
author = ["Guido van Rossum", "Bjarne Stroustrup", "Brendan Eich", "Robert Griesemer"]

def matingPairs(language,author):
    result_set = set()
    for i in range(len(author)):
        result_set.add((language[i],author[i]))
    return result_set

print(matingPairs(language,author))

{('javascript', 'Brendan Eich'), ('c++', 'Bjarne Stroustrup'), ('go', 'Robert Griesemer'), ('python', 'Guido van Rossum')}
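
As an alternative to indexing with range(len(...)), the built-in zip() pairs up items at the same position directly. The sketch below is one possible variant, not the required solution; the function name matingPairsZip is hypothetical.

```python
language = ["python", "c++", "javascript", "go"]
author = ["Guido van Rossum", "Bjarne Stroustrup", "Brendan Eich", "Robert Griesemer"]

def matingPairsZip(language, author):
    # zip() pairs items at the same position into tuples;
    # set() then collects those tuples.
    return set(zip(language, author))

pairs = matingPairsZip(language, author)
print(len(pairs))                             # 4
print(("go", "Robert Griesemer") in pairs)    # True
```

One difference in behavior: zip() stops at the end of the shorter list, whereas the index-based loop raises IndexError if language has fewer items than author.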

Developing a Program to Count `dictionary` Members

Develop a program that meets the following requirements and run it in the input cell below.

(a) The keys of a dictionary must be unique, but the values do not have to be.
(b) Implement a function named count_values().
(c) count_values() takes a dictionary as its parameter and returns the number of distinct values the dictionary contains.
(d) For example, if {'red': 1, 'green': 1, 'blue': 2} is passed in, it returns 2.

In [14]:

input_dict = {'red': 1, 'green': 1, 'blue': 2}

def count_values(dic):
    # Collect the values in a set so that duplicates collapse to one entry.
    result_set = set()
    for key in dic:
        result_set.add(dic[key])
    return len(result_set)

count_values(input_dict)

Out[14]:

2
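
The same count can be written more compactly with the values() method. The sketch below is an alternative under the same requirements; the name count_values_short is made up, and like the loop version it assumes every value is hashable.

```python
def count_values_short(dic):
    # dic.values() yields every value; set() keeps one of each.
    return len(set(dic.values()))

print(count_values_short({'red': 1, 'green': 1, 'blue': 2}))   # 2
print(count_values_short({}))                                   # 0
```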
