# RegEx

## Python RegEx

A **RegEx**, or **Regular Expression**, is a sequence of characters that forms a search pattern.

RegEx can be used to check if a string contains the specified search pattern.

### RegEx Module

Python has a built-in package called `re`, which can be used to work with Regular Expressions.

Import the `re` module:

```python
import re
```

### RegEx in Python

Once you have imported the `re` module, you can start using regular expressions:

Example

```python
import re

# Check if the string starts with "The" and ends with "Spain":

txt = "The rain in Spain"
x = re.search(r"^The.*Spain$", txt)

if x:
    print("YES! We have a match!")
else:
    print("No match")
```

Output
```
YES! We have a match!
```

#### RegEx Functions

The `re` module offers a set of functions that allow us to search a string for a match:

| Function | Description                                                       |
| :------: | ----------------------------------------------------------------- |
|  findall | Returns a list containing all matches                             |
|  search  | Returns a Match object if there is a match anywhere in the string |
|   split  | Returns a list where the string has been split at each match      |
|    sub   | Replaces one or many matches with a string                        |
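
The table mentions that `search` returns a Match object. As a minimal sketch of what that object offers, the example below reuses the sample sentence from earlier and reads the matched text and its position. The variable names are illustrative only.

```python
import re

txt = "The rain in Spain"

# re.search scans the whole string and returns the first match, or None.
match = re.search(r"ai", txt)

if match:
    first_text = match.group()  # the text that matched
    first_span = match.span()   # (start, end) indices of the match
    print(first_text, first_span)  # ai (5, 7)

# A pattern that does not occur yields None instead of raising an error.
missing = re.search(r"Portugal", txt)
print(missing)  # None
```

Because `None` is falsy, the result of `re.search()` can be tested directly in an `if` statement, as the first example on this page does.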

#### Metacharacters

Metacharacters are characters with a special meaning:

<figure><img src="https://2843449504-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MXS9gjR6N_CAer7Mflk%2Fuploads%2F2BmsIQkk5k9ONgXhoexz%2Fimage.png?alt=media&#x26;token=d6eb1b52-5f0d-469a-8038-7144589c3c9b" alt=""><figcaption></figcaption></figure>
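
The metacharacter table above is only available as an image, so the following sketch exercises a handful of standard metacharacters in code. It does not claim to cover everything in the image, and the second sample string is made up for illustration.

```python
import re

txt = "The rain in Spain"

# .  matches any single character except a newline
dot = re.findall(r"r..n", txt)            # ['rain']

# [] matches any one character from the set
char_set = re.findall(r"[ST]", txt)       # ['T', 'S']

# ^ anchors the pattern at the start, $ at the end of the string
starts = re.findall(r"^The", txt)         # ['The']
ends = re.findall(r"ain$", txt)           # ['ain']

# |  matches either alternative
either = re.findall(r"rain|Spain", txt)   # ['rain', 'Spain']

# Quantifiers: * (zero or more), + (one or more), {n} (exactly n)
words = "heo helo hello"
star = re.findall(r"hel*o", words)        # ['heo', 'helo', 'hello']
plus = re.findall(r"hel+o", words)        # ['helo', 'hello']
exact = re.findall(r"hel{2}o", words)     # ['hello']

print(dot, char_set, starts, ends, either, star, plus, exact)
```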

**Example 1: re.findall()**

```python
# Program to extract numbers from a string

import re

string = 'hello 12 hi 89. Howdy 34'
pattern = r'\d+'  # raw string, so the backslash reaches the regex engine as-is

result = re.findall(pattern, string)
print(result)

# Output: ['12', '89', '34']
```

If the pattern is not found, `re.findall()` returns an empty list.
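
A short sketch of both cases, assuming the same digit pattern as in Example 1: the matches come back as strings, so they are converted with `int()` here, and an input without digits produces an empty list. The second input string is invented for the demonstration.

```python
import re

pattern = r'\d+'

# Matches are returned as strings; convert them if numbers are needed.
numbers = [int(n) for n in re.findall(pattern, 'hello 12 hi 89. Howdy 34')]
print(numbers)  # [12, 89, 34]

# No digits in this input, so findall returns an empty list.
nothing = re.findall(pattern, 'hello hi. Howdy')
print(nothing)  # []

# An empty list is falsy, so it can be checked directly.
if not nothing:
    print('No numbers found')
```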

**Example 2: re.split()**

```python
import re

string = 'Twelve:12 Eighty nine:89.'
pattern = r'\d+'  # raw string, so the backslash reaches the regex engine as-is

result = re.split(pattern, string)
print(result)

# Output: ['Twelve:', ' Eighty nine:', '.']
```

If the pattern is not found, `re.split()` returns a list containing the original string.
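
The sketch below illustrates the no-match case and also shows the optional `maxsplit` argument of `re.split()`, which caps the number of splits. The first input string is made up for illustration.

```python
import re

pattern = r'\d+'

# No digits to split on: the result is a one-element list with the input.
unchanged = re.split(pattern, 'Twelve Eighty nine')
print(unchanged)  # ['Twelve Eighty nine']

# maxsplit=1 stops after the first split; the rest of the string is kept whole.
limited = re.split(pattern, 'Twelve:12 Eighty nine:89.', maxsplit=1)
print(limited)  # ['Twelve:', ' Eighty nine:89.']
```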

**Example 3: re.sub()**

```python
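# Program to remove all whitespace
import re

# The trailing backslash continues the literal on the next line without
# inserting a newline; the \n escape is the only newline in the string.
string = 'abc 12\
de 23 \n f45 6'

# matches one or more whitespace characters
pattern = r'\s+'

# empty string
replace = ''

new_string = re.sub(pattern, replace, string)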
print(new_string)

# Output: abc12de23f456
```

If the pattern is not found, `re.sub()` returns the original string.
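
As a rough illustration, the snippet below checks the no-match case and also uses the optional `count` argument of `re.sub()`, which limits how many matches are replaced. The input strings are invented for the example.

```python
import re

pattern = r'\s+'

# No whitespace in the input, so the string comes back unchanged.
original = 'abc12de23f456'
result = re.sub(pattern, '', original)
print(result == original)  # True

# count=1 replaces only the first match and leaves the others alone.
limited = re.sub(pattern, '', 'abc 12 de 23', count=1)
print(limited)  # abc12 de 23
```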

{% hint style="info" %}
More details can be found in [Regular expression operations](https://docs.python.org/3/library/re.html) and [Regular Expression HOWTO](https://docs.python.org/3/howto/regex.html).
{% endhint %}
