
Jan Svartvik, Department of English, Lund University, Sweden

Lexis in English language corpora

1. The second corpus generation

Many more years ago than I care to remember, on the occasion of my inaugural lecture at Lund University, I spoke with some enthusiasm about the bright future of corpus-based study of spoken language, what with tape-recorders getting smaller, and computers getting bigger. In 1992, at the Fifth Euralex Congress in Tampere, the future of corpus linguistics seems even brighter than on that previous occasion. Yet, while tape-recorders may indeed be a bit smaller (the stereo set, though, seems colossal compared to our gramophone), computers are actually getting smaller too: there has been a radical development from the mainframe to the micro, personal, desktop, laptop, palmtop and notebook.

But not only are computers getting smaller but also faster and cheaper. This fantastic technological hardware development that we are witnessing is of course only one reason for my belief that the future of corpus linguistics is even brighter now than at the beginning of the seventies. The best part is that the hardware is also becoming well matched by software, and software development is indeed crucial if the corpus approach is going to fulfil its promise.

The meaning of "corpus" as given in most dictionaries is rather vague and gives little indication of bright prospects, for example:

• MACQUARIE DICTIONARY: "a body of data".

• COLLINS COBUILD DICTIONARY: "a large number of articles, books, magazines, etc that have been deliberately collected together for some purpose".

• LONGMAN DICTIONARY OF CONTEMPORARY ENGLISH: "a collection .. of material or information for study" (New edition, 1987).

• LONGMAN DICTIONARY OF THE ENGLISH LANGUAGE (New edition, 1991) is explicit: "a collection of spoken and/or written language for scientific study of word formation, sentence structure, sounds, etc".

COBUILD adds the warning: "a formal, technical word" (but, like LONGMAN, also gives the helpful hint that the plural can be either corpora or corpuses).

All of the definitions in these recent works fail to specify "machine-readable", which is of course the current norm and also the topic of this paper, in particular electronic corpora of spoken English.¹ Only LONGMAN gives a clear indication that there are, and should be, corpora of speech - by far the most common use of language and the variety that has too long been neglected in both grammatical and lexicographical description.

It is not often that we can date the beginning of a new bud on the linguistic tree structure, but this is indeed possible with corpus linguistics, at least English corpus linguistics. It is now getting mature, just over […] years of age. From the humble beginning engaging only a small number of linguists, corpora have become "the flavour of the


1. The second corpus generation

2. Why use a corpus in the first place?

3. Corpora of spoken English

4. Statistical vocabulary studies

Table 1. The 50 most frequent words in SEC and comparisons with LOB, Brown, and LLC

EURALEX '92 - PROCEEDINGS

6. Discourse items

Table 3

Table 4

Table 5

8. Semantic fields

9. Collocation

Notes

References
