Data Structures

C-style arrays

Basic Info

This is the normal type of arrays that you've seen since starting coding. Look at some of its features.

int arr[3]; // Declaring an array without initizialization "Be aware of garbage"
int arr[3] = {0}; // Declaring an array with 3 zeroes
int arr[] = {0, 1, 2}; // Declaring an array with 3 different elements 
// Notice that you don't have to add the size
int arr[]; // Doesn't work

Efficiency

Initializing an array

To initialize an array of length $n$ , you have to do $O (n)$ operations. To better understand this, imagine making an array and looping over it to make each element equal to a certain element.

Accessing a certain element of an array

To access a certain element of an array, that requires a constant number of operations.

In order to understand how accessing works, we need to have a look at the concepts of pointers.

Pointers

Whenever we declare a variable, we do the following:

Find a block of memory of size equal to the size of a variable.
Know the exact position of that block of memory.
Save the value of that variable in the variable name.

Let's investigate the following:

int num = 5;
cout << num << endl;

At first, we have booked a block of memory of size equal to $4$ bytes (the size of an integer) and put the value of 5 in it. After that, we attached that value to the variable's name and output it by using the normal cout.

Try this block of code.

int num = 3; // Normal integer
int *pos = &num; // Notice the aestrix and the and sign.
cout << pos << endl; // Output is something like 0x7fff2fbf62dc
cout << *pos << endl; // The output is 3
cout << *(&num) << endl; // the output is 3.

This time we booked a place of a normal variable. Then, we had a weird syntax: * after the int, which means that this a pointer variable that saves the address of a variable in the memory. After that we noticed the &, the reference operator. The reference operator gets the address of a variable in a memory. The output of the last sentence is a hexadecimal code that changes by running the program multiple times. That change in the address of a variable is because of the booking of a normal block of memory for the storage of the variable. Notice that the aestirx * gives the value at a certain address in a memory, where the value $3$ is saved in the address of the pos variable. Furthermore, the last expression is the same as outputting the name of the variable alone as the & gets the position of the element and the * gets the value at the position.

Look at this even more-terrifying code.

int arr[5] = {0}; // an array of five elements
arr[0] = 1; // assigning the first element to 1.
arr[1] = 4; // assigning the second element to 2.
arr[2] = 6; // assigning the third element to 3.
int* pos_arr = arr; // position of the array
int* pos_one = &arr[0]; // position of the first element of an array
cout << *pos_arr << endl; // element at this address.
cout << *pos_one << endl; // element at the start of the address.
cout << *(pos_one + 1) << endl; // second element of the array.
cout << *(pos_arr + 1) << endl; // third element of the array.

At first, we initialized an array (contiguous block of memory of integers) of five zeroes; after that, we assigned the first 3 elements of the array: 1, 4, and 6. I saved the position of both the first element of the array and the array itself in two variables. Then, I outputted the element at both the element at the address of the array and the address of the first element of the array, and they are the same. That gives us the reason of the constant time processing of the array.

The array saves the element in a contiguous block of memory.
The array saves the position of the first element only, and whenever we access the $a rr [i]$ , we are accessing the element at address $a rr + i$ .

STL-arrays

Intro

Knowing that another implementation of the array exists in STL astonished me; however, most competitive programmers do not use it on codeforces and other websites as you cannot make an array of a length that is not const int. I wanted to introduce it to you just for the sake of being familiar to it.

// Make an array of length five
// You cannot make one of length n
array <int, 5> a;
a[0] = 2; // Adding values (could be done using a loop)
a[1] = 3;
a[2] = 4;
a[3] = 6;
a[4] = 7;
for (int i = 0; i < 5; i++)
{
    cout << a[i] << endl; // Outputing values
}

The only way to make the length of an array a variable is to use a const int

const int MAX = 500;
array <int, MAX> arr;

Dynamic Arrays

Preliminary info

Imagine that you have an array of length $1$ . How can you add an element to it?

One could say that we can book an adjacent block of memmory and use that. That seems like a reasonable answer, but if the adjacent block of memory is already used?
Another solution is to make an array of the length + 1, and copy the first array into it. Then, add the new element and delete the first array. That seems good and will work; however, if we add $n$ elements to the array, it would be too much.

The Optimization

Whenever we have an array of a certain length, we will create an array of double the length and copy the contents of the first array into it. Then add the new element and delete the original array. That makes the needed number of copying smaller. That is how a dynamic array work. Dynamic Array is implemented in c++ as a STL container called a vector. It works the same way as an array but has some additional methods.

Good Question

If we have a dynamic array of size $4$ and we add a single element. How does the dynamic array handle that? The dynaimc array will multiply its capacity, raising it to $8$ . After that there comes the concept of iterators: every container in cpp contains two iterators, called container.begin and container.end. The .begin iterator points to the first element while the .end one points to the block of memory after the last element of the container.

How to loop on a cpp container

The `auto` keyword

The word auto can be used instead of any type.

auto num = 5;
int num2 = 5; 
// Those two lines are equivalent

Method one: use iterators

Declare an iterator as the position of the first element; then, keep incrementing the iterator, and once the end of the vector is reached, stop looping.

vector <int> s;
// Here, `auto` replaces `vector <int> :: iterator`
for (auto it = s.begin(); it != s.end(); it++)
{
    cout << *(it) << endl;
}

Method two: use range looping

vector <int> arr(5, 6);
for (auto ele : arr)
{
    cout << ele << endl;
}

CPP vector

vector <int> arr(5); // array of size 5 - O(n)
vector <int> arr2(5, 2) // array of size 5 initialized with 2 - O(n)
vector <int> arr3; // array of size 0
arr.push_back(3); // array got an additional element - Amortized O(1)
arr.pop_back(); // array removed its last element - O(1)
arr2.erase(arr2.begin()); // array removed its first element - O(n)
cout << arr2.size() << endl; // contant time - subtracts the distance between
                            // the .begin and .end iterators.
sort(arr2.begin(), arr2.end()); // sorts the arr2 ascendingly O($n \log n$)
reverse(arr2.begin(), arr2.end()); // vector is now reversed (the first became last) O(n)
sort(arr2.rbegin(), arr2.rend()); // sorted descendingly

Pair

Imagine that you have to store two variables together, like someone's name and their age.

pair <string, int> student; // empty pair {"", 0}
student.first = "Sadek"; // pair -> {"Sadek", 0}
student.second = 16; // pair -> {"Sadek", 16}
pair <string, int> student2 = {"Ahmed", 14};
pair <string, int> student3 = make_pair("Ahmed", 14);

Note that you can make a vector of pairs and sort it.

vector <pair <string, int>> students(3);
students[0] = {"Sadek", 100};
students[1] = {"ahmed", 1000};
students[2] = {"nader", 100};
sort(students.begin(), students.end());
for (int i = 0; i < 3; i++)
{
    cout << students[i].first << " " << students[i].second << endl;
    // ahmed 1000, nader 100, sadek 100  vectors are sorted 
    // ascendingly by the first parameter then the second one
}

Binary Search on Vectors

In the logarithmic time session, we have investigated the binary search technique that searched for an element in a logarithmic time with the condition of having the vector sorted. The same technique goes here for vectors; however, c++ gives us some functions that make the program easier.

lower_bound

The lower_bound function gives an iterator to the first element that is equal to or greater than the element that we are searching for.

// Notice that the array is sorted
vector <int> v = {1, 2, 2, 2, 3};
// we can subtract vectors to knwo the distance between them
cout << lower_bound(v.begin(), v.end(), 2) - v.begin() << endl; // ans is 1, referring to the 2 at 2nd position
// that's because 2 is the first element that is equal to or more than 2.

upper_bound

The upper_bound function gives an iterator to the first element that is greater than the element that we are searching for.

// Notice that the array is sorted
vector <int> v = {1, 2, 2, 2, 3};
// we can subtract vectors to knwo the distance between them
cout << lower_bound(v.begin(), v.end(), 2) - v.begin() << endl; // ans is 4, referring to the 3 at 5th position
// that's because 3 is the first element that is more than 2.

binary_search

The binary_search function gives gives 1 or 0 according to whether the element that we are searching for is in the vector or not.

// Notice that the array is sorted
vector <int> v = {1, 2, 2, 2, 3};
// we can subtract vectors to knwo the distance between them
cout << binary_search(v.begin(), v.end(), 2) << endl; // ans is 1 as 2 is in vector.
cout << binary_search(v.begin(), v.end(), 5) << endl; // ans is 0 as 5 isn't in vector.

equal range

The equal_range function gives gives a pair of two iterators: the first is the result of calling lower_bound and the second is the result of calling upper_bound.

// Notice that the array is sorted
vector <int> v = {1, 2, 2, 2, 3};
// we can subtract vectors to knwo the distance between them
cout << equal_range(v.begin(), v.end(), 2).first - v.begin() <<
" " << equal_range(v.begin(), v.end(), 2).s - v.begin() << endl; // ans is "1 4", which refers 
                                                                // to the first 2 and the first 3

Non-linear DataStructues

Sets

Sets in C++ are something like a vector but with logarithmic time. That logarithmic time is because it sorts things ascendingly. See this:

set <int> s;
s.insert(5);
s.insert(4);
s.insert(3); // Added 5, 4, and 3 to the set
cout << *(s.begin()) << endl; // the first (i.e. smallest) element in the set: 3.

Can we adjust this to do the descending. Just pass a different comparator function.

set <int, greater<int>> s;
s.insert(5);
s.insert(4);
s.insert(3); // Added 5, 4, and 3 to the set
cout << *(s.begin()) << endl; // the first (i.e. greatest) element in the set: 5.

Sets keep elements unique. Let's investigate this:

set <int> s;
s.insert(5);
s.insert(4);
s.insert(4);
s.insert(3); // Added 5, 4, and 3 to the set
cout << (int)s.size() << endl; // size is 3 as 4 is saved only once.

Multisets

Multisets are the same as sets; however, they support multiple sets

multiset <int> multi;
multi.insert(3);
multi.insert(3);
multi.insert(3);
multi.insert(3);
cout << (int)multi.size() << endl; // will give 4 instead of 1

find function

We can use the set.find() or multiset.find() function to know whether an element in a set or not. The function returns an iterator; there are two case:

The iterator is equal to the set.end() or multiset.end(): element doesn't exist.
Otherwise, the iterator points to the element.

The function runs in O( $lo g n$ ).

Binary Search

In sets and multisets, you can use binary search the same way as vectors using the following functions:

s.upper_bound(5); // first element greater than 
s.lower_bound(4); // first element greater than or equal

Unordered sets

Unordered sets are like sets, but they don't sort the elements (i.e. keeps elements unique only). The support find() and insert() in constant time. They don't support any binary search operations at all.

Maps

Maps are like a set of pairs. Imagine that you wanted to save the ages of the friends with their numbers. You could use a pair; however, if you wanted to know the age of Mohammed, you would have two options:

Do linear search: in O( $n$ ) time.
Do Binary search: in O( $lo g n$ ) time, but you need to sort at first in O( $n lo g n$ ) time.

Maps make the whole process easier. see the following:

map <string, int> m;
m["Mohammed"] = 5;
m["Ahmed"] = 4;
cout << m["Ahmed"] << endl; // Output is 4

Maps are like sets in that they use logarithmic time in inserting elements, find() operation, binary search, and even outputing operations. Moreover, maps must have unique keys and are sorted.

Multimaps

They are exactly like maps; however, they support non-unique keys, and they are sorted. They have a different way to insert elements and don't support .find() operation, just loop over them. They support binary search operations supported by map.

multimap <string, int> m;
m.insert({"Ahmed", 4});
m.insert({"Mohammed", 5});

unordered_maps

Unordered maps don't order elements and, consequently, don't support binary search; however, they do support insertion and accessing in constant time, althought with a high constant factor. They have unique keys.

unordered_map <string, int> m;
m["Mohammed"] = 5;
m["Ahmed"] = 4;
cout << m["Ahmed"] << endl; // Output is 4

Bitsets

Bitset can support most binary operations and can represent a quick way of converting an element to its binary representation and looping over it.

bitset <8> b; // A bitset with 8  bits, holds numbers up to (2 ^ 8) - 1.
b = 5;
cout << b << endl; // 00000101
cout << (bitset<10>) 1000 << endl; // casting to a bitset: 1111101000
for (int i = 0; i < 8; i++)
{
    cout << b[i] << endl; // output bits in b.
}