Part 2 in an irregular series about migrating from MATLAB to Python.
Slice notation allows you to quickly take a subset of a vector or matrix, by using a short expression in place of the indices.
In MATLAB you may be familiar with this:
>> a = [1,2,3,4,5,6,7,8,9,10];
>> a(3:7)
ans =
     3     4     5     6     7
The “slice notation” a(3:7) takes the subset of the vector a starting at index 3, up to and including index 7.
Slice notation in Python is similar, but just different enough to cause endless confusion for MATLAB programmers.
In this post we’ll start with the worst of it. In Python, the starting index of any list is 0, and the end index of a slice is not itself included in the slice. The best way to see how this screws things up is by example:
MATLAB | Python |
>> a = [1,2,3,4,5]; | >>> a = [1,2,3,4,5] |
>> a(1:4) | >>> a[1:4] |
ans = 1 2 3 4 | [2, 3, 4] |
>> a(4) | >>> a[4] |
ans = 4 | 5 |
So what happened here?
In MATLAB it’s clear: a(1:4) is the vector consisting of a(1), a(2), a(3), and a(4), from the first index to the fourth — nice and intuitive.
In Python the slice likewise starts at a[1], but because list indices start at 0, a[1] is the second element of the list: a[1] is 2. For this you can probably blame the C language, which uses the same convention.
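As a quick sketch of the index shift, using the same list as in the comparison above: the element MATLAB calls a(k) sits at a[k - 1] in Python. The variable names first, second, and last are only for illustration.

```python
a = [1, 2, 3, 4, 5]

# Python counts positions from 0, so MATLAB's a(k) lives at a[k - 1].
first = a[0]           # MATLAB: a(1)
second = a[1]          # MATLAB: a(2)
last = a[len(a) - 1]   # MATLAB: a(end)

print(first, second, last)
```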
So you might expect a[1:4] to consist of a[1], a[2], a[3], and a[4]. But no: in Python the end index of a slice is excluded, so the element at that index is left out. Thus a[1:4] consists of a[1], a[2], and a[3] (which are 2, 3, and 4, respectively). In the above example, the fifth element of the list, a[4], which is 5, falls outside the slice.
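One way to keep the two conventions straight is to write the translation down once. The sketch below assumes a small helper, matlab_slice, which is a made-up name for illustration and not part of Python or any library. It takes 1-based, inclusive bounds the way MATLAB does and returns the matching Python slice.

```python
def matlab_slice(seq, first, last):
    """Return the elements MATLAB would give for seq(first:last).

    Hypothetical helper: `first` and `last` are 1-based and both inclusive.
    """
    # Shift the start down by one for 0-based indexing. The end needs no
    # shift: the 1-based offset and Python's excluded end cancel out.
    return seq[first - 1:last]


a = [1, 2, 3, 4, 5]

python_style = a[1:4]                  # a[1], a[2], a[3]
matlab_style = matlab_slice(a, 1, 4)   # what MATLAB's a(1:4) returns

print(python_style)
print(matlab_style)
```

In other words, MATLAB's a(m:n) corresponds to Python's a[m-1:n], and a Python slice a[i:j] always has j - i elements (as long as both indices are within the list).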
I find this to be bafflingly counterintuitive, and it took a long time before I got comfortable with it. But the good news is Python slice notation has useful features which are not found in MATLAB; I’ll explore those in a future post.