5

So my problem is this, I have a file that looks like this:

[SHIFT]this isrd[BACKSPACE][BACKSPACE] an example file[SHIFT]1

This would of course translate to

' This is an example file!'

I am looking for a way to parse the original content into the end content, so that a [BACKSPACE] will delete the last character(spaces included) and multiple backspaces will delete multiple characters. The [SHIFT] doesnt really matter as much to me. Thanks for all the help!

1
  • Are [BACKSPACE] and [SHIFT] the only markups that you need to worry about? Commented Feb 3, 2011 at 3:12

5 Answers 5

1

Here's one way, but it feels hackish. There's probably a better way.

def process_backspaces(input, token='[BACKSPACE]'):
    """Delete character before an occurence of "token" in a string."""
    output = ''
    for item in (input+' ').split(token):
        output += item
        output = output[:-1]
    return output

def process_shifts(input, token='[SHIFT]'):
    """Replace characters after an occurence of "token" with their uppecase 
    equivalent. (Doesn't turn "1" into "!" or "2" into "@", however!)."""
    output = ''
    for item in (' '+input).split(token):
        output += item[0].upper() + item[1:]
    return output

test_string = '[SHIFT]this isrd[BACKSPACE][BACKSPACE] an example file[SHIFT]1'
print process_backspaces(process_shifts(test_string))
Sign up to request clarification or add additional context in comments.

Comments

1

If you don't care about the shifts, just strip them, load

(defun apply-bspace ()
  (interactive)
  (let ((result (search-forward "[BACKSPACE]")))
    (backward-delete-char 12)
    (when result (apply-bspace))))

and hit M-x apply-bspace while viewing your file. It's Elisp, not python, but it fits your initial requirement of "something I can download for free to a PC".

Edit: Shift is trickier if you want to apply it to numbers too (so that [SHIFT]2 => @, [SHIFT]3 => #, etc). The naive way that works on letters is

(defun apply-shift ()
  (interactive)
  (let ((result (search-forward "[SHIFT]")))
    (backward-delete-char 7)
    (upcase-region (point) (+ 1 (point)))
    (when result (apply-shift))))

2 Comments

+1 for an Elisp answer! It's (not too suprisingly, I guess) quite good at this sort of thing... I'm a vim person, personally, but things like this sometimes pull me towards emacs.
@Joe Kington - Hehe. To be truthful, this is the sort of thing I'd handle with a keyboard macro and maybe an alist unless there were multiple, large files that needed parsing. It's just that a function is easier to share and explain.
1

This does exactly what you want:

def shift(s):
    LOWER = '`1234567890-=[];\'\,./'
    UPPER = '~!@#$%^&*()_+{}:"|<>?'

    if s.isalpha():
        return s.upper()
    else:
        return UPPER[LOWER.index(s)]

def parse(input):
    input = input.split("[BACKSPACE]")
    answer = ''
    i = 0
    while i<len(input):
        s = input[i]
        if not s:
            pass
        elif i+1<len(input) and not input[i+1]:
            s = s[:-1]
        else:
            answer += s
            i += 1
            continue
        answer += s[:-1]
        i += 1

    return ''.join(shift(i[0])+i[1:] for i in answer.split("[SHIFT]") if i)

>>> print parse("[SHIFT]this isrd[BACKSPACE][BACKSPACE] an example file[SHIFT]1")
>>> This is an example file!

2 Comments

Oops, I just spotted a bug... sorry. Fixing it now
The debug is complete and the result is exactly what you want
0

It seems that you could use a regular expression to search for (something)[BACKSPACE] and replace it with nothing...

re.sub('.?\[BACKSPACE\]', '', YourString.replace('[SHIFT]', ''))

Not sure what you meant by "multiple spaces delete multiple characters".

4 Comments

-1 How will this work for "blah[BACKSPACE][BACKSPACE][BACKSPACE]arf"?
But it needs to delete one space BEFORE the backspace as well as the '[BACKSPACE]' itslef
That's my point -- gahooa's solution won't work for my blah-barf example.
Yeah, i just saw what you were saying, so far the only way I can think of to do it would combine python with autoit or another manual macro/automation service, but the results would be tedious at best, and possibly not 100% functioning
0

You need to read the input, extract the tokens, recognize them, and give a meaning to them.

This is how I would do it:

# -*- coding: utf-8 -*-

import re

upper_value = {
    1: '!', 2:'"',
}

tokenizer = re.compile(r'(\[.*?\]|.)')
origin = "[SHIFT]this isrd[BACKSPACE][BACKSPACE] an example file[SHIFT]1"
result = ""

shift = False

for token in tokenizer.findall(origin):
    if not token.startswith("["):
        if(shift):
            shift = False
            try:
                token = upper_value[int(token)]
            except ValueError:
                token = token.upper()

        result = result + token
    else:
        if(token == "[SHIFT]"):
            shift = True
        elif(token == "[BACKSPACE]"):
            result = result[0:-1]

It's not the fastest, neither the elegant solution, but I think it's a good start.

Hope it helps :-)

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.