I have a file with the following contents:
CREATE TABLE some_name (
fv int,
sv int,
tv int)
CLUSTERED BY (fv,
sv,
tv)
SORTED BY (fv,
sv,
tv) INTO 2 BUCKETS;
-- more text afterwards
I need the script to delete everything from CLUSTERED (inclusive) through BUCKETS (inclusive), but leave the semicolon in place. How can this be done in code? I was given this:
start_word = "CLUSTERED"
end_word = "BUCKETS"
result_lines = []
with open(target_file, 'r') as f:
    erasing = False
    for line in f:
        if not erasing and start_word in line:
            # begin erasing lines (the whole line is dropped)
            erasing = True
            continue
        if erasing and end_word in line:
            # finished erasing lines (the whole line is dropped, ';' included)
            erasing = False
            continue
        if erasing:
            # we are between the start and end of the section we want to erase
            continue
        else:
            # either we haven't started erasing or we have already finished
            result_lines.append(line)
# each line already ends with a newline, so join without a separator
print(''.join(result_lines))
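But it also removes the semicolon: in general, everything on the same line as CLUSTERED or BUCKETS is deleted along with them. The result should look like this: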
CREATE TABLE some_name (
fv int,
sv int,
tv int);
-- more text afterwards
Use a regular expression:
An example with a regular expression:
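A minimal sketch of one way to do this with Python's `re` module, not necessarily the exact snippet originally posted here. The names `sql_text`, `CLAUSE_RE` and `strip_clustered_clause` are made up for illustration, and the sample text is inlined instead of being read from the file:

```python
import re

# Sample input copied from the question; in practice, read it from the file.
sql_text = """CREATE TABLE some_name (
fv int,
sv int,
tv int)
CLUSTERED BY (fv,
sv,
tv)
SORTED BY (fv,
sv,
tv) INTO 2 BUCKETS;
-- more text afterwards
"""

# \s*       : whitespace (including the newline) in front of CLUSTERED
# .*?       : as little as possible, so the match stops at the first BUCKETS
# re.DOTALL : lets "." match newlines, so one match can span several lines
CLAUSE_RE = re.compile(r"\s*\bCLUSTERED\b.*?\bBUCKETS\b", re.DOTALL)


def strip_clustered_clause(text: str) -> str:
    """Remove everything from CLUSTERED through BUCKETS, keeping what follows."""
    return CLAUSE_RE.sub("", text)


cleaned = strip_clustered_clause(sql_text)
print(cleaned, end="")
```

Because the pattern ends at BUCKETS rather than at the end of the line, the semicolon after it is left alone. The leading `\s*` is a design choice: it also swallows the newline before CLUSTERED, so the semicolon ends up right after `tv int)` instead of on a line of its own.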
Result:
CREATE TABLE some_name (
fv int,
sv int,
tv int);
-- more text afterwards