我有几个加权间隔的列表 - 这些是元组(value, start, stop)
:
# 0 1 2 3 4 5 6 7 8 9 10
# 1 [___]
# 2 [___________________________________]
# 3 [_______]
# 8 [_______________________]
# 7 [___________________]
# 6 [___]
# 4 [_______________________]
# input
# (value: float, start: float, stop: float)
list_in = [(2, 0, 9), (3, 0, 2), (8, 1, 7), (4, 1, 7), (1, 3, 4), (5, 5, 10), (6, 8, 9)]
我想找到所有部分重叠并选择“最佳”重叠,即 最大的重叠value
。
对于我的示例,正确答案是:
# 0 1 2 3 4 5 6 7 8 9 10
# 1
# 2
# 3 [___]
# 8 [_______________________]
# 7 [___________]
# 6
# 4
# output
list_out = [(3, 0, 1), (8, 1, 7), (7, 7, 10)]
我会尽量保持简单,说明我所拥有的所有条件,但以下是您应该了解的最低限度:
- 输入的元组数组未以任何方式排序
- 我有一个区间树的实现,如果算法性能更高或更简单,请使用它
3)需要考虑-float('inf')
区间的开始和float('inf')
区间的结束。
- 如果两个间隔的值
value
相同,则越短越好。如果它们的长度相同,则它是间隔的精确副本 - 忽略精确副本。 - 忽略具有的间隔
start == stop
如果这对您来说更容易,以下是相对间距的所有可能情况:
# Target interval:
# 0 1 2 3 4 5 6 7 8 9
# [___________]
# All cases for target:
# 0 1 2 3 4 5 6 7 8 9
# [_______]
# [_______]
# [_______]
# [_______]
# [___]
# [_______]
# [_______]
# [_______]
# [___________________]
# [___________]
区间和点的所有情况:
value, start, stop = interval
point = x
point < start
point == start
start < point and point < stop
point == stop
stop < point
我正面解决这个问题的任何尝试都会导致组合爆炸并将代码变成回调地狱
请帮忙。PS:我可以粘贴我的尝试,但它需要 200 行代码,并且开始如下:
def calc_factor(integral_value: float, depth_start: float, depth_stop: float):
return integral_value / (depth_stop - depth_start)
def split_by_intersections(interval_l: IntegralValueInterval, interval_r: IntegralValueInterval):
assert interval_l.depth_start < interval_l.depth_stop
assert interval_r.depth_start < interval_r.depth_stop
assert interval_l.depth_start > -float('inf')
assert interval_l.depth_stop < float('inf')
assert interval_r.depth_start > -float('inf')
assert interval_r.depth_stop < float('inf')
# if-hell I know this, but I don't know how to simplify it.
if interval_l.depth_start < interval_r.depth_start:
if interval_l.depth_stop < interval_r.depth_start:
# L.start L.stop R.start R.stop
return [interval_l, interval_r]
elif interval_l.depth_stop == interval_r.depth_start:
# L.start (L.stop R.start) R.stop
interval_l.depth_stop -= settings.gis_delta_depth
return [interval_l, interval_r]
elif interval_l.depth_stop > interval_r.depth_start and interval_l.depth_stop < interval_r.depth_stop:
# L.start R.start L.stop R.stop
a = interval_l.depth_start
b = interval_r.depth_start
c = interval_l.depth_stop
d = interval_r.depth_stop
interval_l.depth_start = a
interval_l.depth_stop = b - settings.gis_delta_depth
interval_c = IntegralValueInterval()
interval_c.depth_start = b
interval_c.depth_stop = c - settings.gis_delta_depth
left_factor = calc_factor(interval_l.integral_value, interval_c.depth_start, interval_c.depth_stop)
right_factor = calc_factor(interval_r.integral_value, interval_c.depth_start, interval_c.depth_stop)
if left_factor < right_factor:
interval_c.integral_value = interval_r.integral_value
else:
interval_c.integral_value = interval_l.integral_value
interval_r.depth_stop = d
interval_r.depth_start = c
return [interval_l, interval_c, interval_r]
elif interval_l.depth_stop == interval_r.depth_stop:
# L.start R.start (L.stop R.stop)
left_factor = calc_factor(interval_l.integral_value, interval_l.depth_start, interval_l.depth_stop)
right_factor = calc_factor(interval_r.integral_value, interval_r.depth_start, interval_r.depth_stop)
if left_factor < right_factor:
interval_l.depth_stop = interval_r.depth_start - settings.gis_delta_depth
else:
interval_r.depth_start = interval_l.depth_stop
interval_l.depth_stop -= settings.gis_delta_depth
return [interval_l, interval_r]
...
PS 2.0:我尝试为所有情况编写测试,发现不可能考虑到-float('inf')
,float('inf')
因为存在矛盾的例子,例如,[(2, -float('inf'), 6), (1, -float('inf'), 2)]
不清楚哪个段更好。因此,我正在修改这个问题。
PS 3.0:我在代码中附上了“测试”,以便您可以检查您的解决方案。
def test():
for intervals, expected in (
# value left > value right
([(2, 3, 6), (1, 0, 2)], [(1, 0, 2), (2, 3, 6)]),
([(2, 3, 6), (1, 1, 3)], [(1, 1, 3), (2, 3, 6)]),
([(2, 3, 6), (1, 2, 4)], [(1, 2, 3), (2, 3, 6)]),
([(2, 3, 6), (1, 3, 5)], [(2, 3, 6)]),
([(2, 3, 6), (1, 4, 5)], [(2, 3, 6)]),
([(2, 3, 6), (1, 3, 6)], [(2, 3, 6)]),
([(2, 3, 6), (1, 2, 6)], [(1, 2, 3), (2, 3, 6)]),
([(2, 3, 6), (1, 2, 7)], [(1, 2, 3), (2, 3, 6), (1, 6, 7)]),
([(2, 3, 6), (1, 3, 7)], [(2, 3, 6), (1, 6, 7)]),
([(2, 3, 6), (1, 4, 6)], [(2, 3, 6)]),
([(2, 3, 6), (1, 5, 7)], [(2, 3, 6), (1, 6, 7)]),
([(2, 3, 6), (1, 6, 8)], [(2, 3, 6), (1, 6, 8)]),
([(2, 3, 6), (1, 7, 9)], [(2, 3, 6), (1, 7, 9)]),
# value left == value right
([(2, 3, 6), (2, 0, 2)], [(2, 0, 2), (2, 3, 6)]), # 2 / (6 - 3) < 2 / (2 - 0)
([(2, 3, 6), (2, 1, 3)], [(2, 1, 3), (2, 3, 6)]), # 2 / (6 - 3) < 2 / (3 - 1)
([(2, 3, 6), (2, 2, 4)], [(2, 2, 4), (2, 4, 6)]), # 2 / (6 - 3) < 2 / (4 - 2)
([(2, 3, 6), (2, 3, 5)], [(2, 3, 5), (2, 5, 6)]), # 2 / (6 - 3) < 2 / (5 - 3)
([(2, 3, 6), (2, 4, 5)], [(2, 3, 4), (2, 4, 5), (2, 5, 6)]), # 2 / (6 - 3) < 2 / (5 - 4)
([(2, 3, 6), (2, 3, 6)], [(2, 3, 6)]), # 2 / (6 - 3) == 2 / (6 - 3) (copy)
([(2, 3, 6), (2, 2, 6)], [(2, 2, 3), (2, 3, 6)]), # 2 / (6 - 3) > 2 / (6 - 2)
([(2, 3, 6), (2, 2, 7)], [(2, 2, 3), (2, 3, 6), (2, 6, 7)]), # 2 / (6 - 3) > 2 / (7 - 2)
([(2, 3, 6), (2, 3, 7)], [(2, 3, 6), (2, 6, 7)]), # 2 / (6 - 3) > 2 / (7 - 3)
([(2, 3, 6), (2, 4, 6)], [(2, 3, 4), (2, 4, 6)]), # 2 / (6 - 3) < 2 / (6 - 4)
([(2, 3, 6), (2, 5, 7)], [(2, 3, 5), (2, 5, 7)]), # 2 / (6 - 3) < 2 / (7 - 5)
([(2, 3, 6), (2, 6, 8)], [(2, 3, 6), (2, 6, 8)]), # 2 / (6 - 3) < 2 / (8 - 6)
([(2, 3, 6), (2, 7, 9)], [(2, 3, 6), (2, 7, 9)]), # 2 / (6 - 3) < 2 / (9 - 7)
# value left < value right
([(2, 3, 6), (3, 0, 2)], [(3, 0, 2), (2, 3, 6)]),
([(2, 3, 6), (3, 1, 3)], [(3, 1, 3), (2, 3, 6)]),
([(2, 3, 6), (3, 2, 4)], [(3, 2, 4), (2, 4, 6)]),
([(2, 3, 6), (3, 3, 5)], [(3, 3, 5), (2, 5, 6)]),
([(2, 3, 6), (3, 4, 5)], [(2, 3, 4), (3, 4, 5), (2, 5, 6)]),
([(2, 3, 6), (3, 3, 6)], [(3, 3, 6)]),
([(2, 3, 6), (3, 2, 6)], [(3, 2, 6)]),
([(2, 3, 6), (3, 2, 7)], [(3, 2, 7)]),
([(2, 3, 6), (3, 3, 7)], [(3, 3, 7)]),
([(2, 3, 6), (3, 4, 6)], [(2, 3, 4), (3, 4, 6)]),
([(2, 3, 6), (3, 5, 7)], [(2, 3, 5), (3, 5, 7)]),
([(2, 3, 6), (3, 6, 8)], [(2, 3, 6), (3, 6, 8)]),
([(2, 3, 6), (3, 7, 9)], [(2, 3, 6), (3, 7, 9)]),
):
actual = list(max_values(intervals))
if actual != expected:
print(intervals, expected, actual)
test()
问题通过扫地就解决了。事件是间隔的结束。事件队列是一个已排序的数组;如果各段的开始时间相等,则这些段会排到末尾。状态:覆盖给定点的线段集。集合的结构是支持删除元素的最大值堆。
事件处理:
要打印结果,将存储当前最大值的开头(如果有)和当前最大值本身。如果在此事件中最大值发生变化,
我用原始数据来展示它。
步骤 1-1。收集积分。之后我们得到一个数组
步骤 1-2。转换为最小间隔。我们得到一套
步骤 2. 对于每个区间,选择与其重叠的初始区间,收集权重,选择最大值
步骤 3. 如果相邻区间的权重相等,我们将它们合并。我们得到