I have a graphing/analysis problem I can't quite get my head around. I can brute-force it, but that's slow; maybe someone has a better idea, or knows of a speedy library for Python?
I have 2+ time series data sets (x, y) that I want to aggregate (and subsequently plot). The issue is that the x values across the series don't match up, and I don't want to resort to duplicating values into time bins.
So, given these 2 series:
s1: (1;100) (5;100) (10;100)
s2: (4;150) (5;100) (18;150)
When added together, they should result in:
st: (1;100) (4;250) (5;200) (10;200) (18;250)
Logic:
x=1 s1=100, s2=None, sum=100
x=4 s1=100, s2=150, sum=250 (note that the s1 value is carried over from the previous x)
x=5 s1=100, s2=100, sum=200
x=10 s1=100, s2=100, sum=200
x=18 s1=100, s2=150, sum=250
My current thinking is to iterate over a sorted list of the keys (x), keep the previous value for each series, and query each set to see if it has a new y for that x.
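That idea could be sketched roughly as follows. This is only a minimal illustration, not a tuned implementation: the function name `sum_step_series` is made up for this example, each series is assumed to be a list of (x, y) pairs, and a series is assumed to contribute 0 before its first point (matching the x=1 row above).

```python
def sum_step_series(*series):
    """Sum step-wise series given as lists of (x, y) pairs."""
    # One x -> y lookup per series, so "does it have a new y for x?" is cheap.
    lookups = [dict(s) for s in series]
    # Every distinct x that appears in any series, in ascending order.
    xs = sorted(set(x for s in series for x, _ in s))
    # Last seen y per series; assumed to be 0 before the first point.
    last = [0] * len(series)
    total = []
    for x in xs:
        for i, lookup in enumerate(lookups):
            if x in lookup:
                last[i] = lookup[x]
        total.append((x, sum(last)))
    return total


s1 = [(1, 100), (5, 100), (10, 100)]
s2 = [(4, 150), (5, 100), (18, 150)]
print(sum_step_series(s1, s2))
# [(1, 100), (4, 250), (5, 200), (10, 200), (18, 250)]
```

It builds the full key list and one dict per series up front, which is probably fine for small inputs but is the part that gets slow and memory-hungry as the series grow.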
Any ideas would be appreciated!
Here's a way to do it, putting more of the behaviour on the individual data streams:
class DataStream(object):
    """Wraps one (x, y) series; tracks its current y and its next x."""

    def __init__(self, iterable):
        self.iterable = iter(iterable)
        self.next_item = (None, 0)
        self.next_x = None
        self.current_y = 0
        # Prime the stream: current_y stays 0 until the first point is reached.
        next(self)

    def __next__(self):
        if self.next_item is None:
            raise StopIteration()
        # The pending point becomes the current value...
        self.current_y = self.next_item[1]
        try:
            # ...and the following point (if any) becomes the pending one.
            self.next_item = next(self.iterable)
            self.next_x = self.next_item[0]
        except StopIteration:
            self.next_item = None
            self.next_x = None
        return self.next_item

    def __iter__(self):
        return self


class MergedDataStream(object):
    """Yields (x, sum of current y values) for each distinct x."""

    def __init__(self, *iterables):
        self.streams = [DataStream(i) for i in iterables]
        self.outseq = []

    def __next__(self):
        xs = [stream.next_x for stream in self.streams
              if stream.next_x is not None]
        if not xs:
            raise StopIteration()
        next_x = min(xs)
        current_y = 0
        for stream in self.streams:
            # Only streams that have a point at next_x advance;
            # the others keep contributing their previous y.
            if stream.next_x == next_x:
                next(stream)
            current_y += stream.current_y
        self.outseq.append((next_x, current_y))
        return self.outseq[-1]

    def __iter__(self):
        return self


if __name__ == '__main__':
    seqs = [
        [(1, 100), (5, 100), (10, 100)],
        [(4, 150), (5, 100), (18, 150)],
    ]
    sm = MergedDataStream(*seqs)
    for x, y in sm:
        print("%02s: %s" % (x, y))
    print(sm.outseq)
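For comparison, the same merge can probably be written more compactly with the standard library. The sketch below is an alternative, not the approach above: it uses `heapq.merge` to interleave the already-sorted series and `itertools.groupby` to collect points that share an x. The names `merged_sums` and `_tag` are invented for this example, and each series is assumed to be sorted by x and to count as 0 before its first point.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def _tag(index, points):
    """Yield (x, series index, y) so merged points remember their source."""
    for x, y in points:
        yield x, index, y


def merged_sums(*series):
    """Yield (x, total) for each distinct x; inputs must be sorted by x."""
    tagged = [_tag(i, s) for i, s in enumerate(series)]
    # Last seen y per series; assumed to be 0 before the first point.
    last = [0] * len(series)
    # heapq.merge lazily interleaves the sorted inputs in ascending x order.
    for x, group in groupby(heapq.merge(*tagged), key=itemgetter(0)):
        for _, i, y in group:
            last[i] = y
        yield x, sum(last)


s1 = [(1, 100), (5, 100), (10, 100)]
s2 = [(4, 150), (5, 100), (18, 150)]
print(list(merged_sums(s1, s2)))
# [(1, 100), (4, 250), (5, 200), (10, 200), (18, 250)]
```

Because it is a generator over lazily merged inputs, it never needs the full list of x values in memory, which may matter for long series.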