Combinatorics#

Review: Awkward Arrays#

Before we continue, we need to be sure we properly understand how slicing works in Awkward arrays. As we have seen so far, it is a generalization of slicing in NumPy.

import awkward as ak

# Example array
array = ak.Array(
    [
        [0.0, 1.1, 2.2], 
        [], 
        [3.3, 4.4], 
        [5.5], 
        [6.6, 7.7, 8.8, 9.9]
    ]
)
array

Exercise

Go over the following examples and try to guess what each of them does before running it. One of them fails. Why?

array[2]
array[-1, 1]
array[2:, 0]
array[2:, 1:]
# Why does this one fail?
array[:, 0]
array[[True, False, True, False, True]]
array > 4
array
array[[2, 3, 3, 1]]
ak.num(array)
ak.num(array) > 0
array[ak.num(array) > 0]
array[ak.num(array) > 0][0]
array[ak.num(array) > 0, 0]
array[ak.num(array) > 0][:, 0]
# A jagged array of booleans!
cut = (array * 10 % 2) == 0
cut
array[cut]

Event and particle level cuts#

Let’s now revisit the data we were working with before and see how we could reconstruct the \(Z\) peak using a more sound approach.

import uproot

file = uproot.open(
    "./uproot-tutorial-file.root"
)
tree = file["Events"]
muons = tree.arrays()

In the last chapter, we saw that the \(Z\) is quite a massive particle. Thus, we can assume that the muons produced in its decay tend to have large momentum. So, let's apply such a requirement. Notice, however, that this is fundamentally different from what we were doing before, where the mask we made with tree["nMuon"] == 2 was flat and thus applied a selection per event. Now, we are applying a selection per muon. It might seem at first that this would be more difficult, but Awkward makes it easy for us!

# Getting the muon transverse momentum
muonpt = muons["Muon_pt"]
muonpt
# Making the muon transverse momentum mask
ptcut = muonpt > 20 # GeV
ptcut

If we now apply this mask to the muonpt array, we effectively filter out all muons that have \(p_T \leq 20\text{ GeV}\).

muonpt[ptcut]
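To make the per-muon selection concrete, here is a minimal pure-Python sketch of what applying a jagged boolean mask does conceptually. The values in `toy_pt` are invented for illustration, and the nested list comprehensions only mimic the behaviour; they are not how Awkward Array is implemented.

```python
# Toy transverse momenta in GeV, one inner list per event (invented values).
toy_pt = [[45.2, 12.1], [8.3], [], [33.0, 27.5, 5.0]]

# Per-muon mask: it has the same jagged shape as the data.
toy_mask = [[pt > 20 for pt in event] for event in toy_pt]

# Applying the mask keeps the event structure but drops the failing muons.
toy_selected = [
    [pt for pt, keep in zip(event, flags) if keep]
    for event, flags in zip(toy_pt, toy_mask)
]

print(toy_mask)      # one boolean per muon
print(toy_selected)  # same number of events, fewer muons
```

In this sketch the second event is still present after the selection, only with no muons left in it.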

Using this mask by itself will leave us with events that have no muons at all! So, out of this muon-level mask, let's construct an event-level mask which requires at least one muon with \(p_T\) over \(20 \text{ GeV}\) in each event. To do this, we will use the ak.any function. When applied to an array of booleans, it effectively performs the OR operation on all of the elements. We want this to be done per event, so we pass axis=1 as an argument (axis=0 is the outermost dimension, axis=1 is the first dimension inside it, etc.).

event_ptcut = ak.any(ptcut, axis=1)
event_ptcut
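As a rough mental model, reducing a jagged boolean array with axis=1 behaves like calling Python's built-in `any` on each inner list. The sketch below uses invented toy values and plain lists; it illustrates the semantics only and makes no claim about Awkward Array's internals.

```python
# Toy per-muon mask and momenta (invented values), one inner list per event.
toy_mask = [[True, False], [False], [], [True, True, False]]
toy_pt = [[45.2, 12.1], [8.3], [], [33.0, 27.5, 5.0]]

# Reduce each event to a single boolean: True if at least one muon passed.
toy_event_mask = [any(flags) for flags in toy_mask]

# Event-level selection: keep or drop whole events, with all of their muons.
toy_kept = [event for event, keep in zip(toy_pt, toy_event_mask) if keep]

print(toy_event_mask)
print(toy_kept)
```

An event with no muons reduces to False, and the events that survive keep all of their muons, including those below the threshold, because the selection now acts on whole events.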

Let’s now apply this to the data.

muons_geq2pt = muons[event_ptcut]
muons_geq2pt

Exercise

Create the exact same event_ptcut using ak.max. Keep in mind that you will have to use the axis argument, similar to how it was done for ak.any.

Combinatorics#

Because we don’t know exactly which process each detected particle originated from (i.e., whether it is part of the signal or background), we must use combinatorics to consider all possible combinations of muons within each event. This combinatorial approach, together with physical constraints such as conservation of charge and energy, allows us to reconstruct the properties of the parent particle while helping to suppress background events. Awkward Array provides two key functions for these tasks:

  • ak.cartesian(): Computes the Cartesian product (cross product) of multiple arrays, generating all possible pairs (or tuples) of elements, one from each array. This is useful for pairing different types of particles or objects within events.

  • ak.combinations(): Computes all unique combinations of elements from a single array, sampled without replacement. This is especially useful for finding all possible pairs, triplets, etc., of the same particle type within an event, without repeating the same element.

In the next chapter, we will be applying these functions to our data in order to improve our dimuon mass spectrum. For now, we will finish this chapter by just seeing how these two functions work.

ak.cartesian()#

As the name suggests, ak.cartesian() computes the Cartesian product of one array with another, event by event.

numbers = ak.Array(
    [
        [1, 2, 3], 
        [], 
        [5, 7], 
        [11]
    ]
)

letters = ak.Array(
    [
        ["a", "b"], 
        ["c"], 
        ["d"], 
        ["e", "f"]
    ]
)

pairs = ak.cartesian((numbers, letters))
pairs
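For intuition, the per-event pairing can be mimicked in plain Python with `itertools.product`. This is only a conceptual sketch using ordinary lists with the same values as above; the name `toy_pairs` is made up for the illustration.

```python
from itertools import product

# The same values as above, as plain Python lists (one inner list per event).
numbers = [[1, 2, 3], [], [5, 7], [11]]
letters = [["a", "b"], ["c"], ["d"], ["e", "f"]]

# Pair every number with every letter, separately within each event.
toy_pairs = [list(product(nums, lets)) for nums, lets in zip(numbers, letters)]

for event in toy_pairs:
    print(event)
```

The second event produces no pairs because it has no numbers to pair the letter with.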

To get the first or second element of each pair, we pass a “string index” ("0" or "1").

pairs["0"]
pairs["1"]

Note that this is different from passing an integer index.

pairs[0]

If we want to separate the first element of each pair into one array and the second element of each pair into another (something which we will later see is quite useful), we can use ak.unzip.

lefts, rights = ak.unzip(pairs)
print(lefts)
print(rights)
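The difference between the two kinds of index, and what unzipping does, can be sketched with plain Python lists of tuples. The names `toy_pairs`, `toy_lefts` and `toy_rights` are hypothetical stand-ins, and the comprehensions only imitate the behaviour described above.

```python
# Plain-Python stand-in for the array of pairs (one inner list per event).
toy_pairs = [
    [(1, "a"), (1, "b"), (2, "a"), (2, "b"), (3, "a"), (3, "b")],
    [],
    [(5, "d"), (7, "d")],
    [(11, "e"), (11, "f")],
]

# Integer index: picks out one whole event (here, the first one).
first_event = toy_pairs[0]

# "String index" "0" or "1": picks one slot of every pair, in every event.
toy_lefts = [[pair[0] for pair in event] for event in toy_pairs]
toy_rights = [[pair[1] for pair in event] for event in toy_pairs]

print(first_event)
print(toy_lefts)
print(toy_rights)
```

The two resulting lists have identical jagged shapes, so their elements line up one to one.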

ak.combinations#

ak.combinations generates all unique combinations of a specified size from each sub-array, without repeating elements or considering their order. This is useful for finding all possible pairs, triplets, etc., within each event, allowing you to explore every possible grouping of elements for further analysis.

pairs = ak.combinations(numbers, 2)
pairs
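A plain-Python analogue of this per-event operation is `itertools.combinations` applied to each inner list. The sketch below reuses the values of `numbers` as ordinary lists and is meant only to illustrate the idea of sampling without replacement; `toy_pairs` is a made-up name.

```python
from itertools import combinations

# The same values as above, as plain Python lists (one inner list per event).
numbers = [[1, 2, 3], [], [5, 7], [11]]

# All unordered pairs within each event, each element used at most once.
toy_pairs = [list(combinations(event, 2)) for event in numbers]

for event in toy_pairs:
    print(event)
```

Each pair appears only once: the sketch produces (1, 2) but neither (2, 1) nor (1, 1).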

Because the elements in the sub-arrays line up once they are separated using ak.unzip, we can do computations with them:

lefts, rights = ak.unzip(pairs)
lefts * rights
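Conceptually, the element-wise product pairs up the two halves of every combination. A minimal pure-Python sketch, assuming the pairs are the ones formed from `numbers` and using the hypothetical names `toy_pairs` and `toy_products`:

```python
# Pairs formed within each event, written out as plain Python tuples.
toy_pairs = [[(1, 2), (1, 3), (2, 3)], [], [(5, 7)], []]

# Multiply the two members of every pair, keeping the event structure.
toy_products = [[left * right for left, right in event] for event in toy_pairs]

print(toy_products)
```

The result has one value per pair and keeps one entry per event.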

Moreover, note that we can change the size of the combinations we allow.

ak.combinations(numbers, 3)

Exercise

See what happens when you increase the size of the combinations to 4. Can you explain the output?