#P3203. Incremental Python Parsing

Incremental Python Parsing

Description

For the sake of simplicity, only the following subset of Python syntax is considered:

PLAIN_TEXT:
    [^"()\[\]{}]+

STRING:
    \"[^"]*\"

text:
      (empty)
    |
text PLAIN_TEXT
    |
text STRING
    |
text bracket_enclosed_text
bracket_enclosed_text:
      ( text )
    | [ text ]
    | { text }

From the syntax some terms are defined as follows:

  1. String: a sequence delimited by quotes (") of characters other than quotes, possibly including new-lines.

    Examples:
      "a(bc"
      "a
    bc)"
      "usage: foo [bar]
    this help information
    is
    totally
    useless"

    (Python actually """ for such strings, but let’s accept the modification just for this problem.)

  2. Bracket-enclosed text: a sequence of characters other than brackets or quotes enclosed by matched pairs of brackets.

    Example:
      (1)
      {a->[foo.bar(
      "a[",b,c
      )]}

  3. Logical new-line: a new-line character that is not part of a string or bracket-enclosed text

  4. Physical line: a maximal sequence of characters other than new-line characters

    Examples:
    def a(b)    # physical line 1
      foo.bar( 
    # physical line 2
      "a[",b)  
    # physical line 3
    "blah blah,
    # physical line 4
    blah"      
    # physical line 5

  5. Logical line: a maximal sequence of characters other than logical new-lines

    Examples:
    def a(b)    # logical line 1
      foo.bar( 
    # logical line 2
      "a[",b)  
    # logical line 2
    "blah blah,
    # logical line 3
    blah"      
    # logical line 3 

  6. Valid/invalid program: a program without/with mismatched quotes or brackets that are not part of any strings 

    Examples:
      def a(b
      "blah

Given a program, you are to implement the following primitive operations:

  • modify(a, b, c): Modify the character at physical line a, column b to c
  • query(a, b): Tell which logical line the character at physical line a, column b is on.

Input

The given program appears first in the input followed by a line with a single #. Then come the operation descriptions. Each operation is described in the format c a b, where c is a character, a a physical line number and b a column number. Both physical line and column numbers are counted starting from 1. If c is ?, the operation will be query(a, b), otherwise it will be modify(a, b, c). The program can be invalid while being edited, but whenever a query operation is demanded, it will always be valid. A modify operation will neither remove existing spaces and new-lines in the program nor introduce new ones. Tabs are not present in the input.

The program contains up to 106 characters (including new-lines) and 105 physical lines each consisting of up to 80 characters. And there will be at most 105 operations.

Output

For each query operation, output a line containing the logical line number.

aaa
(bb
ccc
dd)
fff
#
? 1 1
? 5 1
b 2 1
d 4 3
? 5 1
" 1 3
" 5 2
? 5 1
a 1 3
f 5 2
? 5 1
1
3
5
1
5

Source

POJ Monthly--2007.03.04, Hou, Qiming