Given a string, a substring of it consists of some consecutive characters from it, taken in sequence. Thus "STRING" is a substring of "BIGGER STRING", but "B STRING" & "BIG REG" are not.

There is a notation
called __slicing__ for describing substrings, & this can be applied
to arbitrary string expressions. The general form is

string expression (start **TO** finish)

so that, for instance,

"ABCDEF" (2 **TO** 5) = "BCDE"

If you omit the start, then 1 is assumed; if you omit the finish then the length of the string is assumed. Thus

"ABCDEF" ( **TO** 5) = "ABCDEF" (1 **TO** 5) = "ABCDE"

"ABCDEF" (2 **TO** ) = "ABCDEF" (2 **TO** 6) = "BCDEF"

&

"ABCDEF" ( **TO** ) = "ABCDEF" (1 **TO** 6) = "ABCDEF"

(you can also write this last one as "ABCDEF" (), for what it's worth.)

A slightly different form misses out the **TO** & just has one number:

"ABCDEF" (3) = "ABCDEF" (3 **TO** 3) = "C"

Although normally both start & finish must refer to existing parts of the string, this rule is overridden by one other: if the start is more than the finish, then the result is the empty string. So

"ABCDEF" (5 **TO** 7)

gives error 3 (subscript error) because the string only contains 6 characters, & 7 is too many, but

"ABCDEF" (8 **TO** 7) = ""

&

"ABCDEF" (1 **TO** 0) = ""

The start & finish must not be negative, or you get error B.
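The rules above can be modelled in a few lines of Python. This is only a sketch for checking your understanding, not how the ZX81 itself works: the names zx_slice, zx_char & ZXError are made up for the illustration, and the order in which the two error checks are applied is an assumption.

```python
class ZXError(Exception):
    """Stand-in for a ZX81 error report; code is "3" or "B"."""

    def __init__(self, code):
        super().__init__(f"error {code}")
        self.code = code


def zx_slice(s, start=None, finish=None):
    """Model of s(start TO finish) with 1-based, inclusive bounds."""
    # Omitted bounds default to 1 and to the length of the string.
    if start is None:
        start = 1
    if finish is None:
        finish = len(s)
    # Assumed check order: negative bounds are rejected first (error B).
    if start < 0 or finish < 0:
        raise ZXError("B")
    # A start beyond the finish overrides the range check: empty string.
    if start > finish:
        return ""
    # Otherwise both ends must refer to existing characters (error 3).
    if start < 1 or finish > len(s):
        raise ZXError("3")
    return s[start - 1:finish]


def zx_char(s, n):
    """Model of the single-number form s(n), i.e. s(n TO n)."""
    return zx_slice(s, n, n)
```

For instance, zx_slice("ABCDEF", 2, 5) gives "BCDE", zx_slice("ABCDEF", 8, 7) gives the empty string, & zx_slice("ABCDEF", 5, 7) raises the stand-in for error 3.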

This next program makes B$ equal to A$, but omitting any trailing spaces.

10 **INPUT** A$

20 **FOR** N=**LEN** A$ **TO** 1 **STEP** -1

30 **IF** A$(N)<>" " **THEN GOTO** 50

40 **NEXT** N

50 **LET** B$=A$( **TO** N)

60 **PRINT** "**""**";A$;"**""**","**""**";B$;"**""**"

70 **GOTO** 10

Note how if A$ is entirely
spaces, then in line 50 we have N = 0 & A$( **TO** N) = A$(1 **TO**
0) = "".
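The same backward scan can be sketched in Python, which makes the all-spaces case easy to try out. The function name strip_trailing_spaces is made up for the illustration, and Python counts characters from 0, so a[n - 1] plays the part of A$(N).

```python
def strip_trailing_spaces(a):
    """Mirror lines 20 to 50: scan back from the end for a non-space."""
    n = len(a)
    # Step n downwards, as FOR N=LEN A$ TO 1 STEP -1 does.
    while n >= 1 and a[n - 1] == " ":
        n -= 1
    # n is 0 if no non-space character was found; a[:0] is "".
    return a[:n]


print('"' + strip_trailing_spaces("HELLO   ") + '"')
print('"' + strip_trailing_spaces("   ") + '"')
```

When the string is entirely spaces (or empty), n reaches 0 & the result is the empty string, just as A$( **TO** 0) is.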

For string variables, we can not only extract substrings, but also assign to them. For instance type

**LET** A$="LOR LOVE A DUCK"

& then

**LET** A$(5 **TO** 8)="******"

&

**PRINT** A$

Notice how since the
substring A$(5 **TO** 8) is only 4 characters long, only the first four
stars have been used. This is a characteristic of assigning to substrings:
the substring has to be exactly the same length afterwards as it was before.
To make sure this happens, the string that is being assigned to it is cut
off on the right if it is too long, or filled out with spaces if it is
too short - this is called __Procrustean__ assignment after the inn-keeper
Procrustes who used to make sure that his guests fitted the bed by either
stretching them out on a rack or cutting their feet off.

If you now try

**LET** A$()="COR BLIMEY"

&

**PRINT** A$;"."

you will see that the same thing has happened again (this time with spaces put in) because A$() counts as a substring.

**LET** A$="COR BLIMEY"

will do it properly.
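Procrustean assignment can be sketched in Python as a function that returns the new string. The name procrustean_assign is made up for the illustration, and the sketch assumes that start & finish are valid 1-based bounds with start no greater than finish.

```python
def procrustean_assign(s, start, finish, value):
    """Model of LET s(start TO finish)=value; returns the new string."""
    # Assumes 1 <= start <= finish <= len(s).
    width = finish - start + 1
    # Cut off on the right if too long, pad with spaces if too short.
    fitted = value[:width].ljust(width)
    return s[:start - 1] + fitted + s[finish:]


a = "LOR LOVE A DUCK"
a = procrustean_assign(a, 5, 8, "******")
print(a)

# Assigning to the whole string as a substring keeps the old length.
b = procrustean_assign(a, 1, len(a), "COR BLIMEY")
print(b + ".")
```

The first assignment keeps only four of the six stars; the second pads "COR BLIMEY" with spaces up to the original 15 characters, which is why the full stop is printed some way to the right.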

Slicing may be considered as having priority 12, so, for instance

**LEN** "ABCDEF"(2 **TO** 5) = **LEN**("ABCDEF"(2 **TO** 5)) = 4

Complicated string expressions will need brackets round them before they can be sliced. For example,

"ABC"+"DEF"(1 **TO** 2) = "ABCDE"

("ABC"+"DEF")(1 **TO** 2) = "AB"
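Python's slices happen to bind in the same way, so the effect of the brackets can be tried there too. This is only an analogy: Python counts from 0 & leaves out the end position, so [0:2] corresponds to (1 **TO** 2).

```python
# Without brackets the slice applies only to the nearest string, "DEF".
unbracketed = "ABC" + "DEF"[0:2]

# With brackets the slice applies to the whole concatenation.
bracketed = ("ABC" + "DEF")[0:2]

print(unbracketed)
print(bracketed)
```

The first line printed is ABCDE & the second is AB, matching the two BASIC expressions.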

**Summary**

Slicing, using **TO**.
Note that this notation is non-standard.

**Exercises**

1. Some BASICs (__not__ the ZX81 BASIC)
have functions called LEFT$, RIGHT$, MID$ & TL$.

LEFT$(A$,N) gives the substring of A$ consisting of the first N characters.

RIGHT$(A$,N) gives the substring of A$ consisting of the characters from the Nth on.

MID$(A$,N1,N2) gives the substring of A$ consisting of N2 characters starting at the N1th.

TL$(A$) gives the substring of A$ consisting of all its characters except the first.

How would you write
these in ZX81 BASIC? Would your answers work with strings of length 0 or
1?

2. Try this sequence of commands:

**LET** A$="X*+*Y"

**LET** A$(2)=**CHR$** 11 [the string quote character]

**LET** A$(4)=**CHR$** 11

**PRINT** A$

A$ is now a string with string quotes inside it! So there is nothing to stop you doing this if you are persevering enough, but clearly if you had originally typed

**LET** A$="X"+"Y"

the part to the right of the equals sign would have been treated as an expression, giving A$ the value "XY".

Now type

**LET** B$="X""+""Y"

You will find that although A$ & B$ look the same when printed out, they are not equal - try

**PRINT** A$=B$

Whereas B$ contains
mere quote image characters (with code 192), A$ contains genuine string
quote characters (with code 11).

3. Run this program:

10 **LET** A$="**LEN** **""**ABDC**""**"

100 **PRINT** A$;" = ";**VAL** A$

This will fail because **VAL** does not treat the quote image **""** as a string quote.

Insert some extra lines
between 10 & 100 to replace the quote images in A$ by string quotes
(which you must call **CHR$** 11), & try again.

Make the same modifications
to the program in chapter 9, exercise 3, & experiment with it.

4. This subroutine deletes every occurrence of the string "CARTHAGO" from A$.

1000 **FOR** N=1 **TO LEN** A$-7

1020 **IF** A$(N **TO** N+7)="CARTHAGO" **THEN LET** A$(N **TO** N+7)="********"

1030 **NEXT** N

1040 **RETURN**

Write a program that gives A$ various values (e.g. "DELENDA EST CARTHAGO.") & applies the subroutine.
