Finding groups of consecutive integers in an array with Python

Today’s article isn’t a trading tutorial per se. Instead, it covers a helper function from the personal fintech library I developed for myself.
This function is incredibly useful when you want to check how long a particular trend has been going on (e.g., the number of days a stock has been trading above a key support level).
We’ll be using it in Part 7 of this series for market stage detection.
This tutorial is part 5 in a larger series on getting started with fintech and market analysis with Python:
- How to download market data with yfinance and Python
- Rethinking yfinance’s default MultiIndex format
- How to plot candlestick charts with Python and mplfinance
- How to compute Simple Moving Averages (SMAs) for trading with Python and Pandas
- Finding consecutive integer groups in arrays with Python and NumPy (this tutorial)
- Computing slope of series with Pandas and SciPy
- Market stage detection with Python and Pandas
- Implementing TradingView’s Stochastic RSI indicator in Python
- Introduction to position sizing
- Risk/Reward analysis and position sizing with Python
Configuring your development environment #
Before we dive in, let’s set up our Python environment with the packages we’ll need:
$ pip install numpy pandas
Implementing our method to find consecutive groups of integers #
Let’s get started by importing the necessary packages:
# import the necessary packages
from typing import Union
from typing import Sequence
from typing import Tuple
from typing import List
import numpy as np
I’m a huge fan of type annotations in Python, especially when defining classes, methods, and functions.
Type annotations make your code cleaner, and if you use an IDE, they help it provide more intelligent code suggestions and warnings.
We’re importing several typing annotations to create a properly typed function:
- Union: Allows us to specify multiple possible types for a parameter
- Sequence: A generic type representing any sequence (list, tuple, etc.)
- Tuple and List: Type hints for our return values
- numpy: So we can access the ArrayLike annotation
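To make the annotation concrete, here is a minimal sketch of how a parameter typed as Union[ArrayLike, Sequence[int]] accepts lists, tuples, and NumPy arrays alike. The helper name to_index_array is hypothetical and not part of my library, and importing ArrayLike from numpy.typing assumes NumPy 1.20 or newer.

```python
from typing import Sequence, Union

import numpy as np
from numpy.typing import ArrayLike


def to_index_array(idxs: Union[ArrayLike, Sequence[int]]) -> np.ndarray:
    # hypothetical helper: normalize any accepted input into an ndarray
    return np.asarray(idxs)


# a list, a tuple, and an ndarray all satisfy the annotation
from_list = to_index_array([3, 4, 5])
from_tuple = to_index_array((3, 4, 5))
from_array = to_index_array(np.array([3, 4, 5]))
```

All three calls produce the same array, which is why the function body can convert once and then work with a single type.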
Now, let’s define our find_consecutive_integers function:
def find_consecutive_integers(
    idxs: Union[np.typing.ArrayLike, Sequence[int]],
    min_consec: int,
    start_offset: int = 0
) -> List[Tuple[int, int]]:
    # check to see if the indexes input is empty
    if len(idxs) == 0:
        # return an empty list
        return []
Let’s break down this function step by step, starting with the input parameters:
- idxs: The array of integers we’re searching through for consecutive sequences. This parameter accepts either NumPy arrays or Python sequences such as lists.
- min_consec: The minimum number of consecutive integers required to consider a group valid.
- start_offset: An optional offset to apply to the returned indices (defaults to 0).
The return value from find_consecutive_integers will be a list of tuples, where each tuple contains the (start, end) indices, both inclusive, of a consecutive group that meets our minimum size criteria.
The if statement at the top of the function handles empty inputs by simply returning an empty list. This is a defensive programming technique to avoid errors when processing empty data.
Next, we ensure our input is a NumPy array (converting it if needed) and initialize an empty list to store our valid consecutive groups:
    # ensure the indexes are an array, then initialize a list to store the
    # groups
    idxs = np.array(idxs)
    groups = []
Now comes the clever part — finding the boundaries between consecutive sequences:
    # find boundaries in consecutive sequences where the difference between
    # consecutive elements is *not* one, then add in the start and ending
    # indexes to the boundaries
    boundaries = np.where(np.diff(idxs) != 1)[0] + 1
    boundaries = np.concatenate(([0], boundaries, [len(idxs)]))
This code does four important things:
- np.diff(idxs) calculates the difference between adjacent elements in our array
- np.where(np.diff(idxs) != 1) finds positions where the difference is not 1, which indicates a break in consecutive integers
- The + 1 shifts these indices forward by one to mark the start of a new sequence
- Finally, we add 0 at the beginning of boundaries and len(idxs) at the end of boundaries, ensuring we capture the first and last sequences properly
These boundary additions are crucial because without adding 0, we wouldn’t have a marker for the start of the first sequence, and, similarly, without adding len(idxs), we wouldn’t have a marker for the end of the last sequence.
These added boundary markers allow our algorithm to process every sequence in the array, including those at the very beginning and very end.
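To see the boundary logic in action, here is a small worked sketch on an illustrative six-element array (the sample values are my own and are not taken from the test data later in this tutorial). Each intermediate result is kept in its own variable so you can print and inspect it.

```python
import numpy as np

# small illustrative input: two runs (3-5 and 9-10) and a lone value (20)
idxs = np.array([3, 4, 5, 9, 10, 20])

# differences between neighboring elements
diffs = np.diff(idxs)  # [1, 1, 4, 1, 10]

# positions where a new run starts (the element before it is not value - 1)
breaks = np.where(diffs != 1)[0] + 1  # [3, 5]

# add the start of the first run and the end of the last run
boundaries = np.concatenate(([0], breaks, [len(idxs)]))  # [0, 3, 5, 6]

# each adjacent pair of boundaries is a half-open slice covering one run
runs = [idxs[a:b].tolist() for a, b in zip(boundaries[:-1], boundaries[1:])]
# runs is [[3, 4, 5], [9, 10], [20]]
```

Without the leading 0 the first slice would be missing, and without the trailing len(idxs) the lone 20 would never be emitted.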
Next, we loop through each boundary pair to extract consecutive sequences:
    # loop over the boundary ranges
    for i in range(0, len(boundaries) - 1):
        # grab the start and end index of the boundary
        start_idx = boundaries[i]
        end_idx = boundaries[i + 1] - 1
        # check to see if the length of the group meets our minimum
        # threshold
        if end_idx - start_idx + 1 >= min_consec:
            # update the list of groups
            groups.append((
                int(idxs[start_idx]) + start_offset,
                int(idxs[end_idx]) + start_offset
            ))
    # return the groups of consecutive integers
    return groups
For each pair of boundaries, we:
- Calculate the start and end indices
- Check if the sequence is long enough (meets our min_consec requirement)
- If it is, append the actual integer values (with any offset applied) to our result list
Once the loop finishes, we return the list of groups.
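Assembled into one piece, the function might look like the sketch below. It imports ArrayLike directly from numpy.typing (which assumes NumPy 1.20 or newer), and the window_start value and local_idxs list are hypothetical, included only to illustrate what start_offset does.

```python
from typing import List, Sequence, Tuple, Union

import numpy as np
from numpy.typing import ArrayLike


def find_consecutive_integers(
    idxs: Union[ArrayLike, Sequence[int]],
    min_consec: int,
    start_offset: int = 0,
) -> List[Tuple[int, int]]:
    # empty input: there is nothing to group
    if len(idxs) == 0:
        return []

    idxs = np.array(idxs)
    groups = []

    # positions where a run starts, plus the two outer boundaries
    boundaries = np.where(np.diff(idxs) != 1)[0] + 1
    boundaries = np.concatenate(([0], boundaries, [len(idxs)]))

    for i in range(len(boundaries) - 1):
        start_idx = boundaries[i]
        end_idx = boundaries[i + 1] - 1

        # keep only runs with at least min_consec members
        if end_idx - start_idx + 1 >= min_consec:
            groups.append((
                int(idxs[start_idx]) + start_offset,
                int(idxs[end_idx]) + start_offset,
            ))

    # return the collected (start, end) pairs
    return groups


# hypothetical use of start_offset: indices computed on a slice that began
# at position 100 of a longer series are mapped back to the full series
window_start = 100
local_idxs = [0, 1, 2, 7, 8, 9, 10]
local_groups = find_consecutive_integers(local_idxs, min_consec=3)
global_groups = find_consecutive_integers(
    local_idxs, min_consec=3, start_offset=window_start
)
```

Here local_groups is [(0, 2), (7, 10)] and global_groups is [(100, 102), (107, 110)], so the offset saves a separate re-indexing step when you analyze a window of a longer series.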
Verifying our consecutive grouping method is working correctly #
Now, let’s test our function with an array containing various patterns of consecutive and non-consecutive integers:
# define an array of consecutive integers
test_array = np.array([
    3, 4, 5, 6,  # 4 consecutive integers
    9, 10,  # 2 consecutive integers
    15, 16, 17, 18, 19,  # 5 consecutive integers
    25,  # single value
    30, 31, 32,  # 3 consecutive integers
    40, 42, 44,  # non-consecutive integers
    50, 51, 52, 53  # 4 consecutive integers
])
This test array contains:
- A sequence of 4 consecutive integers (3-6)
- A sequence of 2 consecutive integers (9-10)
- A sequence of 5 consecutive integers (15-19)
- A single value (25)
- A sequence of 3 consecutive integers (30-32)
- Three non-consecutive integers (40, 42, 44)
- A sequence of 4 consecutive integers (50-53)
Let’s see if our function correctly identifies sequences with at least 3 consecutive integers:
# find sequences with at least three consecutive integers
find_consecutive_integers(test_array, min_consec=3)
Which outputs:
[(3, 6), (15, 19), (30, 32), (50, 53)]
Perfect! Our function found exactly the four sequences we’d expect:
- The numbers 3 through 6
- The numbers 15 through 19
- The numbers 30 through 32
- The numbers 50 through 53
Now, what if we increase our minimum consecutive threshold to 4?
# find sequences with at least four consecutive integers
find_consecutive_integers(test_array, min_consec=4)
Which gives us:
[(3, 6), (15, 19), (50, 53)]
Again, this matches our expectations. The sequence (30, 32) drops out because it only contains 3 numbers.
Let’s increase the threshold to 5:
# find sequences with at least five consecutive integers
find_consecutive_integers(test_array, min_consec=5)
Note how the size of the output list is smaller than the previous examples:
[(15, 19)]
Now we’re left with only the sequence (15-19), which contains 5 consecutive integers.
If we set a threshold higher than any sequence in our array:
# find sequences with at least ten consecutive integers
find_consecutive_integers(test_array, min_consec=10)
Then we would expect to get an empty list in return:
[]
And finally, let’s verify that our function handles empty inputs correctly:
# verify that passing in an empty list also returns empty groups
find_consecutive_integers([], min_consec=3)
And, again, we receive an empty list:
[]
Our function correctly returns an empty list when given empty input.
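As an extra cross-check on the results above, the same grouping can be sketched with np.split, which cuts the array at every break position. This is an alternative formulation rather than the function from this tutorial, and the name runs_via_split is hypothetical.

```python
import numpy as np


def runs_via_split(idxs, min_consec):
    # alternative sketch: np.split cuts the array at every position where
    # the gap to the previous element is not exactly 1
    arr = np.asarray(idxs)
    if arr.size == 0:
        return []
    pieces = np.split(arr, np.where(np.diff(arr) != 1)[0] + 1)
    # keep the first and last value of every piece that is long enough
    return [(int(p[0]), int(p[-1])) for p in pieces if len(p) >= min_consec]


# same values as the test array used in this section
test_array = [3, 4, 5, 6, 9, 10, 15, 16, 17, 18, 19, 25,
              30, 31, 32, 40, 42, 44, 50, 51, 52, 53]

at_least_3 = runs_via_split(test_array, 3)
at_least_5 = runs_via_split(test_array, 5)
```

Both results agree with the outputs shown earlier: at_least_3 is [(3, 6), (15, 19), (30, 32), (50, 53)] and at_least_5 is [(15, 19)]. Two independent formulations agreeing is a cheap way to gain confidence in a helper like this.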
Exercises #
Before we wrap up, try these exercises to reinforce what you’ve just learned:
- Length calculation: Modify the function to also return the length of each consecutive sequence. This will save you from having to calculate lengths separately when analyzing trend durations.
- Date sequence detection: Implement a version that works with datetime objects to find consecutive dates. This is particularly useful for analyzing trading days where certain conditions persist.
- Trend duration analysis: Use the function on real stock data to identify consecutive days where the price closed above a moving average. How many “winning streaks” did your favorite stock have last year? (Hint: the SMA tutorial earlier in this series will help you get started with moving averages.)
- Custom sequence steps: Extend the function to handle step sizes other than 1 (e.g., sequences increasing by 0.5 or 2) to make it more versatile for different types of data patterns.
- Visualization enhancement: Create a visualization that highlights the consecutive sequences in a time series chart. Can you color-code chart regions based on the length of consecutive patterns?
Final thoughts #
In this tutorial, we’ve built a robust function to identify groups of consecutive integers in arrays. This seemingly simple tool has powerful applications in financial analysis, especially for identifying persistent trends or conditions in market data.
The find_consecutive_integers function leverages NumPy’s vectorized operations to efficiently process even large arrays, making it suitable for working with extensive historical market data.
By identifying sequences of consecutive integers, we can detect when certain market conditions persist for meaningful periods — a key part of many trading strategies.
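As a rough sketch of that idea, the snippet below turns a price condition into integer positions with np.flatnonzero and then applies the same boundary logic inline. The closing prices and the support level are invented for illustration and are not real market data.

```python
import numpy as np

# hypothetical closing prices and support level, invented for illustration
closes = np.array([98.0, 101.5, 102.0, 103.2, 99.1,
                   100.4, 104.0, 105.3, 106.1, 97.5])
support = 100.0

# integer positions of the days that closed above support
above_idxs = np.flatnonzero(closes > support)  # [1, 2, 3, 5, 6, 7, 8]

# same boundary logic as in the tutorial, applied inline
breaks = np.where(np.diff(above_idxs) != 1)[0] + 1
boundaries = np.concatenate(([0], breaks, [len(above_idxs)]))

# (first day, last day, length) for streaks lasting at least 3 days
streaks = [
    (int(above_idxs[a]), int(above_idxs[b - 1]), int(b - a))
    for a, b in zip(boundaries[:-1], boundaries[1:])
    if b - a >= 3
]
# streaks is [(1, 3, 3), (5, 8, 4)]
```

With real data you would build the boolean condition from a DataFrame column instead, but the step from condition to streaks stays the same.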
In the next tutorial, we’ll learn how to compute the slope of time series data using Pandas and SciPy, another simple (yet essential) tool for quantitative trading.