Tuesday, May 8, 2012

Flexible sliding window (in Python)

Problem description: I'm interested in looking at terms within a text window of, say, 3 words to the left and 3 words to the right. The base case has the form w-3 w-2 w-1 term w+1 w+2 w+3. I want to implement a sliding window over my text that lets me record the context words of each term. Every word is treated as the term exactly once; when the window moves on, it becomes a context word for its neighbours. However, when the term is the 1st word in the line, there are no context words on the left (t w+1 w+2 w+3); when it's the 2nd word in the line, there is only one context word on the left, and so on. So I'm looking for hints on implementing this flexible sliding window (in Python) without writing out each possible situation as a separate case.

To recap:

Example of input:

["w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8", "w9", "w10"]

Example of output:


t1 w2 w3 w4

w1 t2 w3 w4 w5

w1 w2 t3 w4 w5 w6

w1 w2 w3 t4 w5 w6 w7

__ w2 w3 w4 t5 w6 w7 w8

__ __ etc.

My current plan is to implement this with a separate condition for each line in the output, which is exactly what I would like to avoid.
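
One possible way to get the shrinking window at the edges without a condition per case is to rely on list slicing, clamping only the left index so it never goes negative. The following is a minimal sketch under that assumption, not a definitive implementation; `context_windows` and its `size` parameter are hypothetical names chosen for illustration.

```python
def context_windows(words, size=3):
    """Yield (term, left_context, right_context) for every word in `words`."""
    for i, term in enumerate(words):
        # Clamp the left edge at 0; a negative start index would be
        # counted from the end of the list and give the wrong slice.
        left = words[max(0, i - size):i]
        # Slicing past the end of a list is safe and just returns fewer items.
        right = words[i + 1:i + 1 + size]
        yield term, left, right


words = ["w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8", "w9", "w10"]
for term, left, right in context_windows(words):
    # Mark the current term as tN, as in the example output above.
    print(" ".join(left + [term.replace("w", "t", 1)] + right))
```

The first printed line is `t1 w2 w3 w4` and the fifth is `w2 w3 w4 t5 w6 w7 w8`. The right edge needs no special handling because a slice that runs past the end of the list is simply shorter.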

