2

Iterable/Sequences in Python

 2 years ago
source link: https://jyzhu.top/2021/10/25/Iterable-Sequences-in-Python/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Iterable/Sequences

Strings, ranges, lists and tuples.

“An iterable which supports efficient element access using integer indices via the __getitem__ special method and defines a __len__ method that returns the length of the sequence.”

  • use one index to fetch item in a list, the index can't out of the range; but if use slice method, then you can do it out of the range, because the slice will autolly find the greediest indexs.

  • assigning a list to another list, like a = b, makes a and b refers to the same list. And a list is mutable. So if we change a, then b will also be affected. It's called alias. Do avoid this.

  • Tuple also has order. Just it's immutable.

  • Fxxxk, Tuple and Set are different things! A set is like {1, 2, 3}, that is what I thought as not ordered and not duplicated. And a set has intersection() union() difference() symmetric_difference().

    To delete items in a set:

    set1.discard(6) # If the item doesn't exist, no exception
    set1.remove(6) # Raise an exception if doesn't exist
    set1.pop() # randomly delete one item.
  • you can insert an interable in a list's slice, though they have different length!

    >>> a = [1, 2, 3, 4, 5]
    >>> a[3:3] = [6, 7]
    >>> a
    [1, 2, 3, 6, 7, 4, 5]
  • list() takes an iterable, and makes it a list;

    str() shows you the whole object's look.

  • range() is also an iterable. It can be slicing, and len()

  • a, b = b, a: the b, a on the right firstly is packed into tuple: (b, a), then it is unpacked and individually assigned to each var on the left

Sequence operations

All: (list tuple str range)

image-20210825165741137image-20210825165741137image-20210825165757277image-20210825165757277

Except range:

image-20210825165814598image-20210825165814598

List:

image-20210825165842016image-20210825165842016

l.extend() treats the input as an iterable, and take each element of the iterable as a new element;

l.append() treats the input as a new element.

l.pop() Pop and return element of some index (if omitted, pop last element)

l.remove() Remove first occurrence of element

be careful which methods can cause runtime error

for-comprehension: elem for var in interable if cond

all(<iterable>): whether all are true

any(<interable>): any true

Generator expressions

only retrieve elements when need

Yield expression

Generator functions are written using the function* syntax. When called, generator functions do not initially execute their code. Instead, they return a special type of iterator, called a Generator. When a value is consumed by calling the generator's next method, the Generator function executes until it encounters the yield keyword.

>>> def f():
... cur = 0
... while True:
... yield cur
... cur += 1
>>> a = f()
>>> type(a)
generator
>>> next(a)
0
>>> next(a)
1
>>> for i in f():
... print(i)

0
1
2

###迭代器和生成器

  • 迭代器 an iterable,跟sequence不同的是它不能slicing,也不一定有len(),因为是可以无穷的。其定义就是一个实现了__iter()____next()__方法的对象。从道理上来讲是一个可以遍历sequence,并且记住遍历位置的对象。直到访问完所有元素,抛出一个StopIteration异常。
  • 生成器 a generator:使用了yield的函数,必然返回一个迭代器。用起来和迭代器是一样的。
    • 至于yield的用法,在运行生成器的时候,其实也就是next(<generator>)的时候,函数会运行,直到遇到yield,就暂停并保存当前运行信息,返回yield的值;下一次运行时从当前位置继续

Dict and Set

  • Use short-circuit evaluation to fetch a key in a Dict, but if it doesn't exist, it will return a False, instead of raising an error:

    'Toronto' in MLB_team and MLB_team['Toronto']
    # if 'Toronto' is in the list, => True and xxx, it will return the xxx;
    # if it's not in the list, => False and xxx, it will return False and skip running the xxx.

    But d.get() method is similar. It will give a None if the key doesn't exist.

  • set can only contains hashable items. set is unhashable. So a set can't be put into a set.

  • All mutable types are unhashable

  • set() function picking up an iterable and use it one by one to create a set. so set({1,2,3}) is ok.

  • frozenset is immutable, so also hashable. so can be put into a set.

>>> set([1, 1.0])
{1}

>>> n1 = 1234e23
>>> n2 = 1234e23
>>> id(n1), id(n2)
(4318748880, 4318749104)
>>> hash(n1), hash(n2)
(539300935830116283, 539300935830116283)
  • Strange: Python's behavior is different in shell/script/ways:
>>> n1, n2 = 1234e23, 1234e23
>>> id(n1), id(n2)
(4317529008, 4317529008)

1. hash for built-in data types

  • For built-in immutable data types (i.e. float, str, int), hash uses their value, so, x == y => hash(x) == hash(y)

  • For tuple, hash depends on its elements. So, some tuple are hashable, some are not. depends on whether members are all hashable:

    >>> t1, t2 = (1, 'abc'), (1, 'abc', [1])
    >>> hash(t1)
    7760687994730794974
    >>> hash(t2)
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'list'
  • hash is None for built-in mutable types (i.e. list, set, dict)

2. hash for user defined classes
  • For user defined classes, by default:
    • __hash__ and __eq__ are based on id by default (but they can be overriden) (so, by default, hash, ==(equal) and is are the same)
    • duplicate is defined as: x == y <=> x is y and hash(x) == hash(y)
    • 如果你改了equal的逻辑,那么就要把hash的逻辑也改了,否则就会变成None
  • 说人话:
    • 默认来说,全都看id
    • 如果光改了eq,那hash就成none了。想保留hash的话,要满足这个条件:只要ab全等,必须ab哈希值一样(不然你是魔鬼)
  • 哈希冲突 (collision): 不同的元素(a == b: False)经过哈希函数后发现哈希值一样(hash(a) == hash(b))。是需要一些策略来进行处理,以确保把这些元素都储存下来的
  • 重复(duplicate):两个元素全等(而不是id一样!!!),hash得到的值又一样(a == b and hash(a) == hash(b)),自然认为是同一个元素,直接覆盖了

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK