Dynamic 2D Arrays (Continued)
Vik Rubenfeld, Programmer



[Return to 2D Arrays Main]



Archived Thread:

Date: Tue, 27 Apr 2004 10:36
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
To: macpascal <macpascal@avalon.pascal-central.com>
To: James Chandler Jr <jchandjr@bellsouth.net>


Cool. Thanks for the info.

-Vik

On 4/27/04 10:20 AM, "James Chandler Jr" <jchandjr@bellsouth.net> wrote:

> Hi Vik
>
> Just an idea...
>
> If this is gonna be OSX-Only, it may be as-good-or-better to use a pointer
> rather than handle.
>
> This would save one level of indirection.
>
> From lurking the Carbon mailing list, apparently Pointers are preferred over
> Handles in OSX, due to the improved memory implementation in OSX.
>
> If I'm remembering the discussions correctly, they said that Handles are just
> 'kludged' on-top of the improved Pointers, and Handles no longer offer
> memory-management advantages as they did in Classic MacOS.
>
> JCJR



Date: Tue, 27 Apr 2004 10:41
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
To: macpascal <macpascal@avalon.pascal-central.com>
To: Steve Bird <SBird@Culverson.com>


> Not one whit, unless you're trying to match a pre-existing structure.
>
> If YOU are the only one who writes it, and YOU are the only one who
> reads it, then you can do it either way, as long as you are consistent.
>
> MyArrayHandle^^[3,17] will be the same string whether you call 3 the
> row or you call 17 the row.

True. However, say we create the handle like this:

HandleToLarge2DArray := NewHandle(RecordSize * 100000 * 2);

I would guess that one of the following is going to be outside the block of ram allocated -- either:

HandleToLarge2DArray^^[2, 100000]

or

HandleToLarge2DArray^^[100000, 2]

Is that not so?


-Vik


Date: Tue, 27 Apr 2004 11:18
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
To: macpascal <macpascal@avalon.pascal-central.com>


On 4/27/04 10:54 AM, "Steve Bird" <SBird@Culverson.com> wrote:

>> I would guess that one of the following is going to be outside the
>> block of
>> ram allocated -- either:
>>
>> HandleToLarge2DArray^^[2, 100000]
>>
>> or
>>
>> HandleToLarge2DArray^^[100000, 2]
>>
>> Is that not so?
>
> --- True.
>
> I'm not sure, but I think the Pascal standard says that arrays can be
> stored column major, or row-major order, and that it's
> implementation-specific (i.e. there is no standard).

Interesting!

> Perhaps your best bet would be to try your own examples above. The one
> that doesn't crash is the one to use...

I guess I can...

- Get the address of the first byte in ram pointed to by the handle (or pointer, given the info posted by James).

- Calculate the address of the last byte in ram of the data belonging to the pointer.

- Determine the addresses of array references of the kinds mentioned above.

- ...and see which one's out of range.

I'll report back to the list what I find out.

--
Date: Tue, 27 Apr 2004 12:4
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
To: macpascal <macpascal@avalon.pascal-central.com>


On 4/27/04 10:54 AM, "Joe van Tunen" <jvantunen@chancery.com> wrote:

> The type definition says that there is always MaxLongInt
> * SizeOf( String[ maxFieldWidth ] ) bytes between each row which is impossible
> (that's more memory than can fit in a computer).

On 4/27/04 11:28 AM, "Steve Bird" <SBird@Culverson.com> wrote:

> You ain't got that much memory.

True. That's the benefit in this case of allocating the ram as needed. The type thinks it has that much memory, but the actual pointer points to only the amount of ram that is required at the time. I've done this extensively with 1D arrays.

-Vik


Date: Tue, 27 Apr 2004 12:35
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
To: macpascal <macpascal@avalon.pascal-central.com>


On 4/27/04 12:29 PM, "Joe van Tunen" <jvantunen@chancery.com> wrote:

> The problem is that when you try to use HandleToLarge2DArray^^[aRow, aColumn]
> the compiler is doing this math: row * MaxLongInt * RecordSize no matter what
> you say NumberOfColumnsNeeded is.
>
> The first row will be at offset 0 from the start of the memory of the 2D
> array. The second row will always be at offset 2147483647 * RecordSize. The
> third row will always be at offset 2 * 2147483647 * RecordSize. etc.
>
> Anything after the first row will not be accessible. That is why you should
> not define a 2D array type if it is dynamic.
>
> Define a 1D array and use a routine like in my example:
>
>
> FUNCTION GetRecordPtr( column, row: LongInt ): RecordTypePtr;
> BEGIN
> GetRecordPtr := @HandleToLarge2DArray^^[ row * NumberOfColumnsNeeded + column
> ];
> END;

Excellent. Thanks very much, Joe, and thanks also to Steve and to others who posted on this.

-Vik



Date: Tue, 27 Apr 2004 20:2
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
To: macpascal <macpascal@avalon.pascal-central.com>


> I found that arrays were one of the things which I made a lot of
> programming errors with, so I wrote a library to handle it for me. If
> anyone is interested I can upload it to Pascal Central.

Definitely. Please upload it. I look forward to checking it out.

-Vik


Date: Wed, 28 Apr 2004 09:49
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
To: macpascal <macpascal@avalon.pascal-central.com>


On 4/28/04 3:44 AM, "Gale Paeper" <gpaeper@empirenet.com> wrote:

> TYPE
> T2DStrArray(M: TIndex1; N: TIndex2; StrCapacity: TStrCapacity) =
> ARRAY[1..M,1..N,0..StrCapacity] OF UNSIGNEDBYTE;

I didn't even know you could do that. Thank you, Gale.

The thread now identifies 3 ways of producing dynamically-allocated 2D arrays:

- The Schema method, as described by Gale.
- The one-pointer method, as described by Steve and Joe.
- The array of pointers to arrays method, as described by James.

I expect this thread will be referenced in the future by coders seeking to work with 2D arrays. In fact, the thread could be posted to Pascal Central as a how-to article.


-Vik


Date: Mon, 3 May 2004 09:11
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
To: macpascal <macpascal@avalon.pascal-central.com>


On 5/3/04 1:35 AM, "Peter N Lewis" <peter@stairways.com.au> wrote:

>> I'm thinking of doing something along these lines to create a dynamically
>> allocated 2D array:
>>
>>
>> Type
>> PtrToLarge2DArray = ^Large2DArray;
>> HandleToLarge2DArray =^PtrToLarge2DArray;
>> Large2DArray = array[1..MaxLongInt] of array[1..MaxLongInt] of
>> str[MaxFieldWidth];
>
> You will not be able to do that as the size of the structure would
> exceed 2^32. Indeed, you cannot even do :
>
> type Large1DArray = array[1..MaxLongInt] of str[MaxFieldWidth]
>
> The first question to ask is what kind of use you will put. Will you
> actually use all of the array, or only parts of it? Will your
> strings be typically most of the size of MaxFieldWidth or typically
> smaller. I presume that MaxFieldWidth is actually constant.
>
> Also important is the speed requirements for your access. Will you
> be accessing the elements frequently. Is memory space or speed more
> important (remembering that if you waste too much memory you will
> start thrashing the harddisk and speed will go out the window). Is
> MaxFieldWidth long enough that copying the string around is going to
> be a problem? Are you going to be changing the strings or once set
> are they constant?
>
> Personally, I would consider doing something like this:
>
> const
> MaxFieldWidth = whatever (make sure it is odd!)
> type
> MyField = string[MaxFieldWidth]
> MyFields = array[0..MaxLongInt div SizeOf(MyField)] of MyField;
> MyFieldsPtr = ^MyFields;
> MyStructure = record
> rows: longint;
> cols: longint;
> fields: MyFieldsPtr;
> end;
>
> function CreateMyStructure( var s: MyStructure; r, c : longint ): OSStatus;
> procedure DestroyMyStructure( var s: MyStructure );
> function MyStructureToIndex( const s: MyStructure; r, c: longint ): longint;
> function MyStructureGet( const s: MyStructure; r, c: longint ): MyField;
> function MyStructureSet( const s: MyStructure; r, c: longint; const
> f: MyField );
>
> CreateMyStructure would allocate space in fields for r*c fields.
> DestroyMyStructure would dispose the space
> MyStructureToIndex would return (r-1)*rows+c-1
> MyStructureGet and MyStructureSet would use MyStructureToIndex and
> assign or read the field.
>
> Implementation is left up to the reader.
>
> If the array is sparse (ie, you wont be using most of it), instead of
> MyField being a string, make it a pointer to a string, and create
> them as needed.
>
> If the strings vary in size a lot, do the above and also create the
> strings to the appropriate size.
>
> Enjoy,
> Peter.


HandleToPTDocStyleRunRecordArray = ^PtrToPTDocStyleRunRecordArray;
PtrToPTDocStyleRunRecordArray = ^PTDocStyleRunRecordArray;
PTDocStyleRunRecordArray = array[1..MaxLongInt] of PTDocStyleRunRecord;

{style record- records all formatting for a given style run, or set of style runs having the same style}
PTDocStyleRecord = record
TotalTabs: integer;
TabLocations: PTDocTabLocations;
CurrentPTDocStyle: PTDocStyles;
CurrentFontID: integer;
CurrentFontSize: integer;
CurrentColor: GRTextColorTypes;
CurrentAdjustSetting: PTDocAdjustSettings;
CurrentLeftIndentInPixels: integer;
VerticalLinesAreOn: boolean;
NumberOfVerticalLines: integer;
HlocsOfVerticalLines: PTDocHLocsOfVerticalLinesArray;
TabLocationsCheckSum: LongInt;
end;



Date: Mon, 3 May 2004 09:11
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
To: macpascal <macpascal@avalon.pascal-central.com>


On 5/3/04 1:35 AM, "Peter N Lewis" <peter@stairways.com.au> wrote:

>> I'm thinking of doing something along these lines to create a dynamically
>> allocated 2D array:
>>
>>
>> Type
>> PtrToLarge2DArray = ^Large2DArray;
>> HandleToLarge2DArray =^PtrToLarge2DArray;
>> Large2DArray = array[1..MaxLongInt] of array[1..MaxLongInt] of
>> str[MaxFieldWidth];
>
> You will not be able to do that as the size of the structure would
> exceed 2^32. Indeed, you cannot even do :
>
> type Large1DArray = array[1..MaxLongInt] of str[MaxFieldWidth]
>
> The first question to ask is what kind of use you will put. Will you
> actually use all of the array, or only parts of it? Will your
> strings be typically most of the size of MaxFieldWidth or typically
> smaller. I presume that MaxFieldWidth is actually constant.
>
> Also important is the speed requirements for your access. Will you
> be accessing the elements frequently. Is memory space or speed more
> important (remembering that if you waste too much memory you will
> start thrashing the harddisk and speed will go out the window). Is
> MaxFieldWidth long enough that copying the string around is going to
> be a problem? Are you going to be changing the strings or once set
> are they constant?
>
> Personally, I would consider doing something like this:
>
> const
> MaxFieldWidth = whatever (make sure it is odd!)
> type
> MyField = string[MaxFieldWidth]
> MyFields = array[0..MaxLongInt div SizeOf(MyField)] of MyField;
> MyFieldsPtr = ^MyFields;
> MyStructure = record
> rows: longint;
> cols: longint;
> fields: MyFieldsPtr;
> end;
>
> function CreateMyStructure( var s: MyStructure; r, c : longint ): OSStatus;
> procedure DestroyMyStructure( var s: MyStructure );
> function MyStructureToIndex( const s: MyStructure; r, c: longint ): longint;
> function MyStructureGet( const s: MyStructure; r, c: longint ): MyField;
> function MyStructureSet( const s: MyStructure; r, c: longint; const
> f: MyField );
>
> CreateMyStructure would allocate space in fields for r*c fields.
> DestroyMyStructure would dispose the space
> MyStructureToIndex would return (r-1)*rows+c-1
> MyStructureGet and MyStructureSet would use MyStructureToIndex and
> assign or read the field.
>
> Implementation is left up to the reader.
>
> If the array is sparse (ie, you wont be using most of it), instead of
> MyField being a string, make it a pointer to a string, and create
> them as needed.
>
> If the strings vary in size a lot, do the above and also create the
> strings to the appropriate size.
>
> Enjoy,
> Peter.



Date: Mon, 3 May 2004 09:23
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
To: macpascal <macpascal@avalon.pascal-central.com>


> You will not be able to do that as the size of the structure would
> exceed 2^32. Indeed, you cannot even do :
>
> type Large1DArray = array[1..MaxLongInt] of str[MaxFieldWidth]

I do something that appears to be along those lines:

type PTDocStyleRunRecordArray = array[1..MaxLongInt] of PTDocStyleRunRecord;

...as in:

{style record- records all formatting for a given style run, or set of style runs having the same style}
PTDocStyleRecord = record
TotalTabs: integer;
TabLocations: PTDocTabLocations;
CurrentPTDocStyle: PTDocStyles;
CurrentFontID: integer;
CurrentFontSize: integer;
CurrentColor: GRTextColorTypes;
CurrentAdjustSetting: PTDocAdjustSettings;
CurrentLeftIndentInPixels: integer;
VerticalLinesAreOn: boolean;
NumberOfVerticalLines: integer;
HlocsOfVerticalLines: PTDocHLocsOfVerticalLinesArray;
TabLocationsCheckSum: LongInt;
end;

HandleToPTDocStyleRunRecordArray = ^PtrToPTDocStyleRunRecordArray;
PtrToPTDocStyleRunRecordArray = ^PTDocStyleRunRecordArray;
PTDocStyleRunRecordArray = array[1..MaxLongInt] of PTDocStyleRunRecord;

...and it's always worked well.

> The first question to ask is what kind of use you will put. Will you
> actually use all of the array, or only parts of it?

I will use all of the array, up to the total number of rows and columns allocated when the array is created.

> Will your
> strings be typically most of the size of MaxFieldWidth or typically
> smaller. I presume that MaxFieldWidth is actually constant.

Typically smaller. MaxFieldWidth is a constant.

> Also important is the speed requirements for your access.

Speed of access is very important.

> Will you
> be accessing the elements frequently.

Yes.

> Is memory space or speed more
> important (remembering that if you waste too much memory you will
> start thrashing the harddisk and speed will go out the window).

Speed.

> Is
> MaxFieldWidth long enough that copying the string around is going to
> be a problem?

MaxFieldWidth has never exceeded 255 characters.

> Are you going to be changing the strings or once set
> are they constant?

There is a need to change them.

> Personally, I would consider doing something like this:

[....]

I hadn't considered an array of pointers to strings, which conserves memory, and even permits strings greater than MaxFieldWidth if necessary.

Is there a speed hit from accessing the strings via pointers?

Thanks very much for this excellent advice.

-Vik


Date: Mon, 3 May 2004 23:49
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
To: macpascal <macpascal@avalon.pascal-central.com>


> It probably is fine, most of the time. But
> SizeOF(PTDocStyleRunRecordArray) is likely a bogus number, quite
> probably MacLongInt*SizeOf(PTDocStyleRunRecord) mod 2^32, which could
> be a negative number or indeed pretty much any number.

Very informative. Thank you, Peter.

> Hmm, looking back at your original query, you've got:
>
> NumberOfRowsNeeded:= 75000;
> NumberOfColumnsNeeded:= 2000;

It's for a project that is currently obtaining data at a rate of about 5000 records (i.e. rows) per year. So in five years (planning for the future) there could reasonably be 25000 rows. There are currently 1850 fields per record, so let's call that 2000 columns.

Following your post this morning, I'm currently looking into the possibility of using pointers so as to reduce the amount of memory used by the data itself. The data itself will probably occupy about 24 megabytes per year, based on the amount of data obtained for the first 3 months of the project. (Why only 24 megs of actual data allocated across 10 million fields? The answer is, most fields contain very little data, and there are a large percentage of empty fields as well.)

So in 5 years there may be 25000 * 2000 = 50 million pointers, pointing to 120 megs of data.

50 million pointers * 4 bytes each = 200 million bytes of ram, or a bit under 200 megabytes, just for the pointers. 200 megs plus the 120 megs of actual data is around 320 megs of ram, which is not excessive.

I wonder if OS-X can deal with 50 million pointers.

Of course, there's the question of what sort of data structures to use to keep track of 50 million pointers. :) I've been writing code for that today, based on the excellent advice I have received on this subject from this list.

Does this appear to make the approach I'm considering more practical?


-Vik


Date: Tue, 4 May 2004 01:38
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
To: macpascal <macpascal@avalon.pascal-central.com>


> Ahh, ok, that was what I was asking when I mentioned "Will you
> actually use all of the array, or only parts of it?". If many fields
> are empty, then you are not using the whole array (that is then
> called a sparse array).

I started considering using sparse arrays after reading your posts on this thread. :)

> I would find a way to avoid using system pointers. The reason is, a
> system pointer does not in fact take up 4 bytes. Since a pointer
> knows its size (GetPtrSize requires it) it must take up at least 8
> bytes. And it might well take up more than that with other
> housekeeping stuff, perhaps 16 bytes or more.

I had no idea.

> Instead of using real pointers, what you probably will want to do is
> write your own memory manager designed specifically for this task
> using large blocks of memory allocated from the system.

Excellent.

> If you never had to dispose any fields (ie, you never changed fields
> - or at least never resized them), then all you would need is a
> single index into this block and each time you allocate a new field,
> the pointer is @fields[index] and then you increment index by the
> size of the required field.

I don't need to change the contents of the fields. This makes it much easier per your description.

This is great info. I will be implementing all this tomorrow. Thanks very much, Peter.

-Vik


Date: Tue, 27 Apr 2004 10:06:38 -0700
Subject: Accessing Records in Dynamically Allocated 2D Arrays?
From: Vik Rubenfeld <vikr@mindspring.com>


I'm thinking of doing something along these lines to create a dynamically
allocated 2D array:


Type
PtrToLarge2DArray = ^Large2DArray;
HandleToLarge2DArray =^PtrToLarge2DArray;
Large2DArray = array[1..MaxLongInt] of array[1..MaxLongInt] of
str[MaxFieldWidth];

Var
RecordSize, NumberOfRowsNeeded, NumberOfColumnsNeeded : LongInt;

{...}
Begin
{determine array size on the fly at runtime}
NumberOfRowsNeeded:= 75000;
NumberOfColumnsNeeded:= 2000;

RecordSize:= MaxFieldWidth + 1; {+1 for the length byte of a Pascal
string}
HandleToLarge2DArray := NewHandle(RecordSize * NumberOfRowsNeeded *
NumberOfColumnsNeeded);

FailNil(handle(HandleToLarge2DArray));
MoveHi(handle(HandleToLarge2DArray));
Hlock(handle(HandleToLarge2DArray));

{...}
End;


Now here's the interesting part. To get this to map correctly to the ram
allocated by the call to NewHandle, do I access this array like this:

HandleToLarge2DArray^^[aRow, aColumn]:= 'a string';

Or like this:

HandleToLarge2DArray^^[aColumn, aRow]:= 'a string';


Thanks very much in advance to all for any info.

-Vik



From: "James Chandler Jr" <jchandjr@bellsouth.net>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
Date: Tue, 27 Apr 2004 13:20:30 -0400


Hi Vik

Just an idea...

If this is gonna be OSX-Only, it may be as-good-or-better to use a pointer
rather than handle.

This would save one level of indirection.

From lurking the Carbon mailing list, apparently Pointers are preferred over
Handles in OSX, due to the improved memory implementation in OSX.

If I'm remembering the discussions correctly, they said that Handles are just
'kludged' on-top of the improved Pointers, and Handles no longer offer
memory-management advantages as they did in Classic MacOS.

JCJR


From: Steve Bird <SBird@Culverson.com>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
Date: Tue, 27 Apr 2004 13:20:23 -0400


On Apr 27, 2004, at 1:06 PM, Vik Rubenfeld wrote:

[snip]

> HandleToLarge2DArray := NewHandle(RecordSize * NumberOfRowsNeeded *
> NumberOfColumnsNeeded);
>
> FailNil(handle(HandleToLarge2DArray));
> MoveHi(handle(HandleToLarge2DArray));
> Hlock(handle(HandleToLarge2DArray));
>
> {...}
> End;
>
>
> Now here's the interesting part. To get this to map correctly to the
> ram
> allocated by the call to NewHandle, do I access this array like this:
>
> HandleToLarge2DArray^^[aRow, aColumn]:= 'a string';
>
> Or like this:
>
> HandleToLarge2DArray^^[aColumn, aRow]:= 'a string';

- --- It doesn't matter.

Not one whit, unless you're trying to match a pre-existing structure.

If YOU are the only one who writes it, and YOU are the only one who
reads it, then you can do it either way, as long as you are consistent.

MyArrayHandle^^[3,17] will be the same string whether you call 3 the
row or you call 17 the row.



Date: Tue, 27 Apr 2004 10:41:34 -0700
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: Vik Rubenfeld <vikr@mindspring.com>


> Not one whit, unless you're trying to match a pre-existing structure.
>
> If YOU are the only one who writes it, and YOU are the only one who
> reads it, then you can do it either way, as long as you are consistent.
>
> MyArrayHandle^^[3,17] will be the same string whether you call 3 the
> row or you call 17 the row.

True. However, say we create the handle like this:

HandleToLarge2DArray := NewHandle(RecordSize * 100000 * 2);

I would guess that one of the following is going to be outside the block of
ram allocated -- either:

HandleToLarge2DArray^^[2, 100000]

or

HandleToLarge2DArray^^[100000, 2]

Is that not so?


-Vik


Date: Tue, 27 Apr 2004 10:36:11 -0700
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: Vik Rubenfeld <vikr@mindspring.com>


Cool. Thanks for the info.

-Vik

On 4/27/04 10:20 AM, "James Chandler Jr" <jchandjr@bellsouth.net> wrote:

> Hi Vik
>
> Just an idea...
>
> If this is gonna be OSX-Only, it may be as-good-or-better to use a pointer
> rather than handle.
>
> This would save one level of indirection.
>
> From lurking the Carbon mailing list, apparently Pointers are preferred over
> Handles in OSX, due to the improved memory implementation in OSX.
>
> If I'm remembering the discussions correctly, they said that Handles are just
> 'kludged' on-top of the improved Pointers, and Handles no longer offer
> memory-management advantages as they did in Classic MacOS.
>
> JCJR



Date: Tue, 27 Apr 2004 10:50:19 -0700
From: Richard Palais <palais@brandeis.edu>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?


Dear Vik,

You made the type definitions,

> PtrToLarge2DArray = ^Large2DArray;
> HandleToLarge2DArray =^PtrToLarge2DArray;
> Large2DArray = array[1..MaxLongInt] of array[1..MaxLongInt] of
> str[MaxFieldWidth];


and asked:

> Now here's the interesting part. To get this to map correctly to the
> ram allocated by the call to NewHandle, do I access this array like
> this:
>
> HandleToLarge2DArray^^[aRow, aColumn]:= 'a string';
>
> Or like this:
>
> HandleToLarge2DArray^^[aColumn, aRow]:= 'a string';

Is there some reason not to instead make the definition:

Large2DArray = array[1..MaxLongInt, 1..MaxLongInt] of
str[MaxFieldWidth];

in which case the correct way to access it is clear. (Of course, if you
prefer the other way, then I believe that the answer is still
([aRow,aColumn]), but a little experimentation will tell for sure.)

Dick Palais


> I'm thinking of doing something along these lines to create a
> dynamically allocated 2D array:

> Type
> PtrToLarge2DArray = ^Large2DArray;
> HandleToLarge2DArray =^PtrToLarge2DArray;
> Large2DArray = array[1..MaxLongInt] of array[1..MaxLongInt] of
> str[MaxFieldWidth];
> Var
> RecordSize, NumberOfRowsNeeded, NumberOfColumnsNeeded : LongInt;
> {...}
> Begin
> {determine array size on the fly at runtime}
> NumberOfRowsNeeded:= 75000;
> NumberOfColumnsNeeded:= 2000;
>
> RecordSize:= MaxFieldWidth + 1; {+1 for the length byte of a Pascal
> string}
> HandleToLarge2DArray := NewHandle(RecordSize * NumberOfRowsNeeded *
> NumberOfColumnsNeeded);
>
> FailNil(handle(HandleToLarge2DArray));
> MoveHi(handle(HandleToLarge2DArray));
> Hlock(handle(HandleToLarge2DArray));
>
> {...}
> End;
>
>
> Now here's the interesting part. To get this to map correctly to the
> ram allocated by the call to NewHandle, do I access this array like
> this:
>
> HandleToLarge2DArray^^[aRow, aColumn]:= 'a string';
>
> Or like this:
>
> HandleToLarge2DArray^^[aColumn, aRow]:= 'a string';
>
>
> Thanks very much in advance to all for any info.
>
> -Vik
>
>

Richard S. Palais


Subject: RE: Accessing Records in Dynamically Allocated 2D Arrays?
Date: Tue, 27 Apr 2004 10:54:55 -0700
From: "Joe van Tunen" <jvantunen@chancery.com>


If you make the first number the row and the second number the column (HandleToLarge2DArray^^[aRow, aColumn]), then the elements in a table like this:

column
1 2 3
row |--------
1 | A B C
2 | D E F

will be arranged in memory like this:

A B C D E F


- Your definition of Large2DArray cannot have a variable number of columns (the second number). The type definition says that there is always MaxLongInt * SizeOf( String[ maxFieldWidth ] ) bytes between each row which is impossible (that's more memory than can fit in a computer).

- You get better performance if your index starts from zero.

- Using MaxLongInt makes it dificult to view this type in the CodeWarrior debugger. To allow for an arbitrary sized array that can be bigger than 32k bytes in total size, you only need to make the array size one byte bigger than 32k to force the compiler to use 32 bit arithmetic when calculating the address of an element. If you are using the Code Warrior Pascal range checking option, then you need to make the number of array elements in the definition at least as big as the maximum number of elements you expect to use in the real world (probably less than MaxLongInt).


TYPE
RecordType = String[MaxFieldWidth];
RecordTypePtr = ^RecordType;
CONST
RecordSize = SizeOf(RecordType);
TYPE
LargeArray = array[0..32768 DIV RecordSize] OF RecordType;
PtrToLargeArray = ^LargeArray;
HandleToLargeArray = ^PtrToLargeArray;

VAR
HandleToLarge2DArray : HandleToLargeArray;

FUNCTION GetRecordPtr( column, row: LongInt ): RecordTypePtr;
BEGIN
GetRecordPtr := HandleToLarge2DArray^^[ row * NumberOfColumnsNeeded + column ] );
END;



> ----------
> From: Vik Rubenfeld
> Reply To: Subscribers of macpascal
> Sent: Tuesday, April 27, 2004 10:06 AM
> To: Subscribers of macpascal
> Subject: Accessing Records in Dynamically Allocated 2D Arrays?
>
> I'm thinking of doing something along these lines to create a dynamically
> allocated 2D array:
>
>
> Type
> PtrToLarge2DArray = ^Large2DArray;
> HandleToLarge2DArray =^PtrToLarge2DArray;
> Large2DArray = array[1..MaxLongInt] of array[1..MaxLongInt] of
> str[MaxFieldWidth];
>
> Var
> RecordSize, NumberOfRowsNeeded, NumberOfColumnsNeeded : LongInt;
>
> {...}
> Begin
> {determine array size on the fly at runtime}
> NumberOfRowsNeeded:= 75000;
> NumberOfColumnsNeeded:= 2000;
>
> RecordSize:= MaxFieldWidth + 1; {+1 for the length byte of a Pascal
> string}
> HandleToLarge2DArray := NewHandle(RecordSize * NumberOfRowsNeeded *
> NumberOfColumnsNeeded);
>
> FailNil(handle(HandleToLarge2DArray));
> MoveHi(handle(HandleToLarge2DArray));
> Hlock(handle(HandleToLarge2DArray));
>
> {...}
> End;
>
>
> Now here's the interesting part. To get this to map correctly to the ram
> allocated by the call to NewHandle, do I access this array like this:
>
> HandleToLarge2DArray^^[aRow, aColumn]:= 'a string';
>
> Or like this:
>
> HandleToLarge2DArray^^[aColumn, aRow]:= 'a string';
>
>
> Thanks very much in advance to all for any info.
>
> -Vik
>
>
> --
>
> Pascal Central............ http://pascal-central.com/
> MacPascal List Prefs...... http://pascal-central.com/maillist.html
> MacPascal List Archives... http://macpascal:blaise@excaliburworld.net/Login
>
>


Subject: RE: Accessing Records in Dynamically Allocated 2D Arrays?
Date: Tue, 27 Apr 2004 10:59:54 -0700
From: "Joe van Tunen" <jvantunen@chancery.com>


Oops. That should be:

> GetRecordPtr := @HandleToLarge2DArray^^[ row * NumberOfColumnsNeeded + column ];
>
> ----------
> From: Joe van Tunen
> Reply To: Subscribers of macpascal
> Sent: Tuesday, April 27, 2004 10:54 AM
> To: Subscribers of macpascal
> Subject: RE: Accessing Records in Dynamically Allocated 2D Arrays?
>
>
> FUNCTION GetRecordPtr( column, row: LongInt ): RecordTypePtr;
> BEGIN
> GetRecordPtr := HandleToLarge2DArray^^[ row * NumberOfColumnsNeeded + column ] );
> END;
>
>
>


Date: Tue, 27 Apr 2004 11:00:12 -0700
From: Richard Palais <palais@brandeis.edu>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?


> True. However, say we create the handle like this:
> HandleToLarge2DArray := NewHandle(RecordSize * 100000 * 2);
>
> I would guess that one of the following is going to be outside the
> block of ram allocated -- either:
>
> HandleToLarge2DArray^^[2, 100000]
>
> or
>
> HandleToLarge2DArray^^[100000, 2]
>
> Is that not so?

I don't think so. I believe the block is just one contiguous linear
array of bytes in memory. When you access a particular array element,
then the compiler adds a little code to translate between the 2D array
and the 1D array.

Dick


Date: Tue, 27 Apr 2004 11:18:05 -0700
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: Vik Rubenfeld <vikr@mindspring.com>


On 4/27/04 10:54 AM, "Steve Bird" <SBird@Culverson.com> wrote:

>> I would guess that one of the following is going to be outside the
>> block of
>> ram allocated -- either:
>>
>> HandleToLarge2DArray^^[2, 100000]
>>
>> or
>>
>> HandleToLarge2DArray^^[100000, 2]
>>
>> Is that not so?
>
> --- True.
>
> I'm not sure, but I think the Pascal standard says that arrays can be
> stored column major, or row-major order, and that it's
> implementation-specific (i.e. there is no standard).

Interesting!

> Perhaps your best bet would be to try your own examples above. The one
> that doesn't crash is the one to use...

I guess I can...

- Get the address of the first byte in ram pointed to by the handle (or
pointer, given the info posted by James).

- Calculate the address of the last byte in ram of the data belonging to the
pointer.

- Determine the addresses of array references of the kinds mentioned above.

- ...and see which one's out of range.

I'll report back to the list what I find out.



From: Steve Bird <SBird@Culverson.com>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
Date: Tue, 27 Apr 2004 14:28:13 -0400


On Apr 27, 2004, at 1:41 PM, Vik Rubenfeld wrote:

> However, say we create the handle like this:
>
> HandleToLarge2DArray := NewHandle(RecordSize * 100000 * 2);
>
> I would guess that one of the following is going to be outside the
> block of
> ram allocated -- either:
>
> HandleToLarge2DArray^^[2, 100000]
>
> or
>
> HandleToLarge2DArray^^[100000, 2]
>
> Is that not so?

- --- Wait a minute. Going back to your original post:
> PtrToLarge2DArray = ^Large2DArray;
> HandleToLarge2DArray =^PtrToLarge2DArray;
> Large2DArray = array[1..MaxLongInt] of array[1..MaxLongInt] of
> str[MaxFieldWidth];

You've defined the contents of the handle as an array[1..MaxLongInt] of
array[1..MaxLongInt] of strings.

The compiler should NOT let you get away with that, but even if it
does, you can't then treat it as an array[2,100000] or an array
[100000,2] either one, because you've told the compiler that it's
MAXLONGINT x MAXLONGINT in size.

That means the 0th row (column) is at offset 0, but the 1st row(column)
is at MAXLONGINT x sizeof(element), and the 2nd row(column) is at
MAXLONGINT x 2 * sizeof(element). You ain't got that much memory.



Date: Tue, 27 Apr 2004 14:42:07 -0400
From: Erik Jensen <ejensen@unbc.ca>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?


>On Apr 27, 2004, at 1:41 PM, Vik Rubenfeld wrote:
>
>> However, say we create the handle like this:
>>
>>HandleToLarge2DArray := NewHandle(RecordSize * 100000 * 2);
>>
>>I would guess that one of the following is going to be outside the block of
>>ram allocated -- either:
>>
>>HandleToLarge2DArray^^[2, 100000]
>>
>>or
>>
>>HandleToLarge2DArray^^[100000, 2]
>>
>>Is that not so?
>
>--- Wait a minute. Going back to your original post:
>> PtrToLarge2DArray = ^Large2DArray;
>> HandleToLarge2DArray =^PtrToLarge2DArray;
>> Large2DArray = array[1..MaxLongInt] of array[1..MaxLongInt] of
>>str[MaxFieldWidth];
>
>You've defined the contents of the handle as an array[1..MaxLongInt]
>of array[1..MaxLongInt] of strings.
>
>The compiler should NOT let you get away with that, but even if it
>does, you can't then treat it as an array[2,100000] or an array
>[100000,2] either one, because you've told the compiler that it's
>MAXLONGINT x MAXLONGINT in size.
>
>That means the 0th row (column) is at offset 0, but the 1st
>row(column) is at MAXLONGINT x sizeof(element), and the 2nd
>row(column) is at MAXLONGINT x 2 * sizeof(element). You ain't got
>that much memory.
>

Hi,

Something akin to this is dimly in my memory. I recall having to work
with this in an old application where the 2-D array size had to be
determined at run-time so all my 2-D arrays were of the form
myStorage[0..0] of whatever;

and allocated as pointers or handles. I then had to do the arithmetic
to access point(x,y) each time. As I recall, using [0..0] arrays
meant that there was no range checking. I also recall having to be
careful when the array size exceeded the range of a 16 bit signed
integer, so I had to do the 32 bit math carefully to avoid the
compiler giving a negative offset. I am away from home at the moment
so the relevant files aren't on my laptop here.

Erik.


Date: Tue, 27 Apr 2004 12:04:56 -0700
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: Vik Rubenfeld <vikr@mindspring.com>


On 4/27/04 10:54 AM, "Joe van Tunen" <jvantunen@chancery.com> wrote:

> The type definition says that there is always MaxLongInt
> * SizeOf( String[ maxFieldWidth ] ) bytes between each row which is impossible
> (that's more memory than can fit in a computer).

On 4/27/04 11:28 AM, "Steve Bird" <SBird@Culverson.com> wrote:

> You ain't got that much memory.

True. That's the benefit in this case of allocating the ram as needed. The
type thinks it has that much memory, but the actual pointer points to only
the amount of ram that is required at the time. I've done this extensively
with 1D arrays.

-Vik



From: Steve Bird <SBird@Culverson.com>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
Date: Tue, 27 Apr 2004 15:24:42 -0400


On Apr 27, 2004, at 3:04 PM, Vik Rubenfeld wrote:

> On 4/27/04 10:54 AM, "Joe van Tunen" <jvantunen@chancery.com> wrote:
>
>> The type definition says that there is always MaxLongInt
>> * SizeOf( String[ maxFieldWidth ] ) bytes between each row which is
>> impossible
>> (that's more memory than can fit in a computer).
>
> On 4/27/04 11:28 AM, "Steve Bird" <SBird@Culverson.com> wrote:
>
>> You ain't got that much memory.
>
> True. That's the benefit in this case of allocating the ram as needed.
> The
> type thinks it has that much memory, but the actual pointer points to
> only
> the amount of ram that is required at the time. I've done this
> extensively
> with 1D arrays.

- --- But you can't impose the TYPE declaration you made on it.
Each row is MAXLONGINT elements long (that's what you told the
compiler).
That, by itself, means that even one row occupies more memory than you
have.

Someone mentioned doing your own arithmetic - I would vote for that.

Make it an object.

Initialize it with NRows, NColumns.
The INIT method would calculate how much RAM to allocate.
A READ (Row, Column) method would calculate HandleAddress + (Row *
#Columns + Column) * Size of (Element) and retrieve it for you.
A WRITE (Row, Column) method would calculate the same address and
store it for you.
The destructor would deallocate the handle.


Subject: RE: Accessing Records in Dynamically Allocated 2D Arrays?
Date: Tue, 27 Apr 2004 12:29:18 -0700
From: "Joe van Tunen" <jvantunen@chancery.com>


The problem is that when you try to use HandleToLarge2DArray^^[aRow, aColumn] the compiler is doing this math: row * MaxLongInt * RecordSize no matter what you say NumberOfColumnsNeeded is.

The first row will be at offset 0 from the start of the memory of the 2D array.
The second row will always be at offset 2147483647 * RecordSize.
The third row will always be at offset 2 * 2147483647 * RecordSize.
etc.

Anything after the first row will not be accessible. That is why you should not define a 2D array type if it is dynamic.

Define a 1D array and use a routine like in my example:


FUNCTION GetRecordPtr( column, row: LongInt ): RecordTypePtr;
BEGIN
GetRecordPtr := @HandleToLarge2DArray^^[ row * NumberOfColumnsNeeded + column ];
END;


> ----------
> From: Vik Rubenfeld
> Reply To: Subscribers of macpascal
> Sent: Tuesday, April 27, 2004 12:04 PM
> To: Subscribers of macpascal
> Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
>
> On 4/27/04 10:54 AM, "Joe van Tunen" <jvantunen@chancery.com> wrote:
>
> > The type definition says that there is always MaxLongInt
> > * SizeOf( String[ maxFieldWidth ] ) bytes between each row which is impossible
> > (that's more memory than can fit in a computer).
>
> On 4/27/04 11:28 AM, "Steve Bird" <SBird@Culverson.com> wrote:
>
> > You ain't got that much memory.
>
> True. That's the benefit in this case of allocating the ram as needed. The
> type thinks it has that much memory, but the actual pointer points to only
> the amount of ram that is required at the time. I've done this extensively
> with 1D arrays.
>
> -Vik
>


Date: Tue, 27 Apr 2004 12:35:01 -0700
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: Vik Rubenfeld <vikr@mindspring.com>


On 4/27/04 12:29 PM, "Joe van Tunen" <jvantunen@chancery.com> wrote:

> The problem is that when you try to use HandleToLarge2DArray^^[aRow, aColumn]
> the compiler is doing this math: row * MaxLongInt * RecordSize no matter what
> you say NumberOfColumnsNeeded is.
>
> The first row will be at offset 0 from the start of the memory of the 2D
> array. The second row will always be at offset 2147483647 * RecordSize. The
> third row will always be at offset 2 * 2147483647 * RecordSize. etc.
>
> Anything after the first row will not be accessible. That is why you should
> not define a 2D array type if it is dynamic.
>
> Define a 1D array and use a routine like in my example:
>
>
> FUNCTION GetRecordPtr( column, row: LongInt ): RecordTypePtr;
> BEGIN
> GetRecordPtr := @HandleToLarge2DArray^^[ row * NumberOfColumnsNeeded + column
> ];
> END;

Excellent. Thanks very much, Joe, and thanks also to Steve and to others who
posted on this.

-Vik



From: James Derrick <jamesderrick@pobox.com>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
Date: Wed, 28 Apr 2004 09:11:33 +0800


> Anything after the first row will not be accessible. That is why you
> should not define a 2D array type if it is dynamic.
>
> Define a 1D array and use a routine like in my example:
>
>
> FUNCTION GetRecordPtr( column, row: LongInt ): RecordTypePtr;
> BEGIN
> GetRecordPtr := @HandleToLarge2DArray^^[ row * NumberOfColumnsNeeded
> + column ];
> END;

Why not do something like

type

OneDArray = array [1..MAXINT+1] of Str255;
OneDArrayPtr = ^OneDArray;
OneDArrayHand = ^OneDArrayPtr;

TwoDArray = array[1..MAXINT+1] of OneDArrayHand;

etc

If you implement them as objects, you can make the interface pretty
simple.

I found that arrays were one of the things which I made a lot of
programming errors with, so I wrote a library to handle it for me. If
anyone is interested I can upload it to Pascal Central.

James Derrick



Date: Tue, 27 Apr 2004 20:02:27 -0700
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: Vik Rubenfeld <vikr@mindspring.com>


> I found that arrays were one of the things which I made a lot of
> programming errors with, so I wrote a library to handle it for me. If
> anyone is interested I can upload it to Pascal Central.

Definitely. Please upload it. I look forward to checking it out.

-Vik


Date: Wed, 28 Apr 2004 03:44:43 -0700
From: Gale Paeper <gpaeper@empirenet.com>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?


Vik Rubenfeld wrote:
>
> I'm thinking of doing something along these lines to create a dynamically
> allocated 2D array:
>
> Type
> PtrToLarge2DArray = ^Large2DArray;
> HandleToLarge2DArray =^PtrToLarge2DArray;
> Large2DArray = array[1..MaxLongInt] of array[1..MaxLongInt] of
> str[MaxFieldWidth];
>
> Var
> RecordSize, NumberOfRowsNeeded, NumberOfColumnsNeeded : LongInt;
>
> {...}
> Begin
> {determine array size on the fly at runtime}
> NumberOfRowsNeeded:= 75000;
> NumberOfColumnsNeeded:= 2000;
>
> RecordSize:= MaxFieldWidth + 1; {+1 for the length byte of a Pascal
> string}
> HandleToLarge2DArray := NewHandle(RecordSize * NumberOfRowsNeeded *
> NumberOfColumnsNeeded);
>
> FailNil(handle(HandleToLarge2DArray));
> MoveHi(handle(HandleToLarge2DArray));
> Hlock(handle(HandleToLarge2DArray));
>
> {...}
> End;
>
> Now here's the interesting part. To get this to map correctly to the ram
> allocated by the call to NewHandle, do I access this array like this:
>
> HandleToLarge2DArray^^[aRow, aColumn]:= 'a string';

That isn't going to work. You have too many dynamically sized
components for the static, compile-time sized type. For a static type,
you can have at most a dynamically sized first dimension and still have
the compiler generated array indexed addressing work correctly. (The
compiler doesn't know the other dimensions are varying in size at
run-time so the addressing used is for the type's static, compile-time
size for the additional components beyond the first array dimension.)

If you want dynamic sizing, schema types is just what you need since the
compiler uses the dynamic, run-time established sizes of the components
in calculating component addresses.

This would be much easier with GPC with its direct support for Extended
Pascal's string schema type and its more comprehensive support (also
without the bonus of CW bugs) for Extended Pascal's schema type
features. It can still be made to work with CW Pascal's limited schema
types support with some type coersion hacking and programmer discipline
to ensure the code always stays within the bounds of the dynamically
sized string allocated storage.

The type coersion hacking is needed to work around the inherent static,
compile-time sizing for string type declarations. For run-time varing
sized string storage, some other type declaration must be used which CW
Pascal will accept for a dynamically sized schema type declaration. The
critical part of the alternative, other type declaration which is used
in allocating the storage for the run-time capacity sized string is
getting the correct memory size footprint.

The following short program shows how to do this with CW Pascal. (I've
kept the sizes small for the array dimensions to keep the dianostic
output at a reasonable size.) The dummy in the new call is a CW Pascal
bug workaround. (CAUTION: If you change the T2DStrArray schema type
declaration I used, be sure to check that the new code sets the actual
schema discriminates correctly. I had originally started out with a
slightly different type declations for T2DStrArray but encountered
another bug in CW Pascal's implementation which ended up setting some
the schema discriminate fields to bad values. Thus, the present
T2DStrArray type declaration is partially a result of a bug workaround.)

The type coersion hack with the pointer gymnastics is pretty ugly but
that's what it took to get the compiler to accept type coersion for a
string type to and from the T2DStrArray schema type.

Also, be sure to have the "Copy Strings Using Length byte" option
selected in the project target Pascal Language preference panel. With
that option set, the compiler generates code to use the dynamic,
run-time length of the string in copying the string bytes and thus keeps
the copying within the bounds of the dynamically allocated sting
storage. (Presuming, of course. that you haven't made any coding errors
in string sizing.)

Note: You don't necessarily have to use an intermediary string buffer
like I used in the program. Just keep in mind that you get no compiler
help in finding coding errors in the dynamically sized string storage
area and the more complex the type coersion hacking the more difficult
it is in figuring out what the root cause of the problem is if your code
has bugs in it related to dynamically sized string storage.

PROGRAM Schema2DArray;

TYPE
TIndex1 = 1..MaxLongInt;
TIndex2 = 1..MaxLongInt;
TStrCapacity = 1..255;
{Last index 0..StrCapacity is storage area for string[StrCapacity].
The 0 lower bound automatically adds in the string type hidden
length byte.}
T2DStrArray(M: TIndex1; N: TIndex2; StrCapacity: TStrCapacity) =
ARRAY[1..M,1..N,0..StrCapacity] OF UNSIGNEDBYTE;
T2DStrArrayPtr = ^T2DStrArray;
Str255 = string[255];
Str255Ptr = ^Str255;

VAR
My2DStrArrayPtr : T2DStrArrayPtr;
StrBuffer: Str255;
dummy: INTEGER;
i, j: LONGINT;
MaxFieldWidth, NumberOfRowsNeeded, NumberOfColumnsNeeded : LongInt;

BEGIN
{MaxFieldWidth, NumberOfRowsNeeded, and NumberOfColumnsNeeded can be
calculated at run-time.}
NumberOfRowsNeeded:= 5;
NumberOfColumnsNeeded:= 10;
MaxFieldWidth := 64; {with 0 lower bound this will allocate storage for string[64]}

new(My2DStrArrayPtr,dummy,NumberOfRowsNeeded,NumberOfColumnsNeeded,MaxFieldWidth);
FOR i := 1 TO My2DStrArrayPtr^.M DO
FOR j := 1 TO My2DStrArrayPtr^.N DO
BEGIN
StrBuffer := StringOf('Line ', i:1, ',', j:1);
Str255Ptr(@My2DStrArrayPtr^[i,j,0])^ := StrBuffer;
END;
FOR i := 1 TO My2DStrArrayPtr^.M DO
FOR j := 1 TO My2DStrArrayPtr^.N DO
BEGIN
StrBuffer := Str255Ptr(@My2DStrArrayPtr^[i,j,0])^;
writeln(StrBuffer:length(StrBuffer));
END;

END.

Gale Paeper
gpaeper@empirenet.com


From: "Joseph Kirby" <joseph.kirby@leros.net>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
Date: Wed, 28 Apr 2004 09:19:26 -0400


Just in case somone was about to do this...

1st Don't forget to add error checking. As in is my row / column within
bounds. Now having worked with vectors I would sugjest you resize your array
and copy the memory over if you get out of bounds but that's just me.

2nd if your going to be using and resizing larg chunks of memory I find it's
faster to have an array of pointers to blocks of memory vs just having it all
in one block of memory. It tends to fragment your memory over time but it also
handles memory fragmentations better.

On Tue, 27 Apr 2004 12:35:01 -0700
Vik Rubenfeld <vikr@mindspring.com> wrote:
> On 4/27/04 12:29 PM, "Joe van Tunen" <jvantunen@chancery.com> wrote:
>
> > The problem is that when you try to use HandleToLarge2DArray^^[aRow,
> aColumn]
> > the compiler is doing this math: row * MaxLongInt * RecordSize no matter
> what
> > you say NumberOfColumnsNeeded is.
> >
> > The first row will be at offset 0 from the start of the memory of the 2D
> > array. The second row will always be at offset 2147483647 * RecordSize. The
> > third row will always be at offset 2 * 2147483647 * RecordSize. etc.
> >
> > Anything after the first row will not be accessible. That is why you should
> > not define a 2D array type if it is dynamic.
> >
> > Define a 1D array and use a routine like in my example:
> >
> >
> > FUNCTION GetRecordPtr( column, row: LongInt ): RecordTypePtr;
> > BEGIN
> > GetRecordPtr := @HandleToLarge2DArray^^[ row * NumberOfColumnsNeeded +
> column
> > ];
> > END;
>
> Excellent. Thanks very much, Joe, and thanks also to Steve and to others who
> posted on this.
>
> -Vik
>


Date: Wed, 28 Apr 2004 16:51:36 +0200
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: Adriaan van Os <macpascal@microbizz.nl>


Joseph Kirby wrote:

> 2nd if your going to be using and resizing larg chunks of memory I
> find it's
> faster to have an array of pointers to blocks of memory vs just having
> it all
> in one block of memory. It tends to fragment your memory over time
> but it also
> handles memory fragmentations better.

If you are really going to work with large chunks of memory, forget
about arrays. Dynamic data structures, like AVL-trees, are much
superior in performance. This is probably the reason why Wirth didn't
add dynamic arrays to Pascal as a langage, although they were in
Pascal's predecessor Algol (!)

Modern computing, with megabytes, gigabytes of available memory could
(and should) inspire a renewed interest in dynamic data structures. For
further reading, I recommend;

Niklaus Wirth, "Algorithms & Data Structures", Prentice-Hall 1986, ISBN
0-13-022005-1 01 (examples in Modula-2)
Dereck Wood, "Data Structures, Algorithms, and Performance", University
of Waterloo, Addison Wesley 1993, ISBN 0-201-52148-2 (examples in
Pascal)

Regards,

Adriaan van Os


Date: Wed, 28 Apr 2004 09:49:54 -0700
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: Vik Rubenfeld <vikr@mindspring.com>


On 4/28/04 3:44 AM, "Gale Paeper" <gpaeper@empirenet.com> wrote:

> TYPE
> T2DStrArray(M: TIndex1; N: TIndex2; StrCapacity: TStrCapacity) =
> ARRAY[1..M,1..N,0..StrCapacity] OF UNSIGNEDBYTE;

I didn't even know you could do that. Thank you, Gale.

The thread now identifies 3 ways of producing dynamically-allocated 2D
arrays:

- The Schema method, as described by Gale.
- The one-pointer method, as described by Steve and Joe.
- The array of pointers to arrays method, as described by James.

I expect this thread will be referenced in the future by coders seeking to
work with 2D arrays. In fact, the thread could be posted to Pascal Central
as a how-to article.


-Vik


Date: Thu, 29 Apr 2004 11:02:02 +0200
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: robin cowan <r.cowan@merit.unimaas.nl>


I don't know if it is still true with later versions of CW, but it used
to be that there were a couple of bugs in the compiler that caused
runtime errors accessing elements in dynamic arrays.
So if you are getting runtime errors when trying to access the arrays
(like NaN or
-MaxLongInt in weird places) check the following:
If you want an array of size n, you have to declare it to be of size
n+2, otherwise you cannot access the last element.
In other instances you could only access every other element in the
array, so you have to declare it to be of size 2n, and then use only
the even (or odd, I don't think it matters, so long as you leave a
distance of 1 between them) elements.

These bugs applied to both 1 and 2 dimensional arrays.

Sorry I can't be more specific about the conditions under which they
apply, (it has been a while) but I experienced both of them, under two
different ways of creating the dynamic arrays.

I think Joachim Harloff first pointed this out to me.

Robin Cowan



On Wednesday, Apr 28, 2004, at 12:44 Europe/Amsterdam, Gale Paeper
wrote:

> Vik Rubenfeld wrote:
>>
>> I'm thinking of doing something along these lines to create a
>> dynamically
>> allocated 2D array:
>>
>> Type
>> PtrToLarge2DArray = ^Large2DArray;
>> HandleToLarge2DArray =^PtrToLarge2DArray;
>> Large2DArray = array[1..MaxLongInt] of array[1..MaxLongInt] of
>> str[MaxFieldWidth];
>>
>> Var
>> RecordSize, NumberOfRowsNeeded, NumberOfColumnsNeeded : LongInt;
>>
>> {...}
>> Begin
>> {determine array size on the fly at runtime}
>> NumberOfRowsNeeded:= 75000;
>> NumberOfColumnsNeeded:= 2000;
>>
>> RecordSize:= MaxFieldWidth + 1; {+1 for the length byte of a
>> Pascal
>> string}
>> HandleToLarge2DArray := NewHandle(RecordSize * NumberOfRowsNeeded
>> *
>> NumberOfColumnsNeeded);
>>
>> FailNil(handle(HandleToLarge2DArray));
>> MoveHi(handle(HandleToLarge2DArray));
>> Hlock(handle(HandleToLarge2DArray));
>>
>> {...}
>> End;
>>
>> Now here's the interesting part. To get this to map correctly to the
>> ram
>> allocated by the call to NewHandle, do I access this array like this:
>>
>> HandleToLarge2DArray^^[aRow, aColumn]:= 'a string';
>
> That isn't going to work. You have too many dynamically sized
> components for the static, compile-time sized type. For a static type,
> you can have at most a dynamically sized first dimension and still have
> the compiler generated array indexed addressing work correctly. (The
> compiler doesn't know the other dimensions are varying in size at
> run-time so the addressing used is for the type's static, compile-time
> size for the additional components beyond the first array dimension.)
>
> If you want dynamic sizing, schema types is just what you need since
> the
> compiler uses the dynamic, run-time established sizes of the components
> in calculating component addresses.
>
> This would be much easier with GPC with its direct support for Extended
> Pascal's string schema type and its more comprehensive support (also
> without the bonus of CW bugs) for Extended Pascal's schema type
> features. It can still be made to work with CW Pascal's limited schema
> types support with some type coersion hacking and programmer discipline
> to ensure the code always stays within the bounds of the dynamically
> sized string allocated storage.
>
> The type coersion hacking is needed to work around the inherent static,
> compile-time sizing for string type declarations. For run-time varing
> sized string storage, some other type declaration must be used which CW
> Pascal will accept for a dynamically sized schema type declaration.
> The
> critical part of the alternative, other type declaration which is used
> in allocating the storage for the run-time capacity sized string is
> getting the correct memory size footprint.
>
> The following short program shows how to do this with CW Pascal. (I've
> kept the sizes small for the array dimensions to keep the dianostic
> output at a reasonable size.) The dummy in the new call is a CW Pascal
> bug workaround. (CAUTION: If you change the T2DStrArray schema type
> declaration I used, be sure to check that the new code sets the actual
> schema discriminates correctly. I had originally started out with a
> slightly different type declations for T2DStrArray but encountered
> another bug in CW Pascal's implementation which ended up setting some
> the schema discriminate fields to bad values. Thus, the present
> T2DStrArray type declaration is partially a result of a bug
> workaround.)
>
> The type coersion hack with the pointer gymnastics is pretty ugly but
> that's what it took to get the compiler to accept type coersion for a
> string type to and from the T2DStrArray schema type.
>
> Also, be sure to have the "Copy Strings Using Length byte" option
> selected in the project target Pascal Language preference panel. With
> that option set, the compiler generates code to use the dynamic,
> run-time length of the string in copying the string bytes and thus
> keeps
> the copying within the bounds of the dynamically allocated sting
> storage. (Presuming, of course. that you haven't made any coding
> errors
> in string sizing.)
>
> Note: You don't necessarily have to use an intermediary string buffer
> like I used in the program. Just keep in mind that you get no compiler
> help in finding coding errors in the dynamically sized string storage
> area and the more complex the type coersion hacking the more difficult
> it is in figuring out what the root cause of the problem is if your
> code
> has bugs in it related to dynamically sized string storage.
>
> PROGRAM Schema2DArray;
>
> TYPE
> TIndex1 = 1..MaxLongInt;
> TIndex2 = 1..MaxLongInt;
> TStrCapacity = 1..255;
> {Last index 0..StrCapacity is storage area for string[StrCapacity].
> The 0 lower bound automatically adds in the string type hidden
> length byte.}
> T2DStrArray(M: TIndex1; N: TIndex2; StrCapacity: TStrCapacity) =
> ARRAY[1..M,1..N,0..StrCapacity] OF UNSIGNEDBYTE;
> T2DStrArrayPtr = ^T2DStrArray;
> Str255 = string[255];
> Str255Ptr = ^Str255;
>
> VAR
> My2DStrArrayPtr : T2DStrArrayPtr;
> StrBuffer: Str255;
> dummy: INTEGER;
> i, j: LONGINT;
> MaxFieldWidth, NumberOfRowsNeeded, NumberOfColumnsNeeded : LongInt;
>
> BEGIN
> {MaxFieldWidth, NumberOfRowsNeeded, and NumberOfColumnsNeeded can be
> calculated at run-time.}
> NumberOfRowsNeeded:= 5;
> NumberOfColumnsNeeded:= 10;
> MaxFieldWidth := 64; {with 0 lower bound this will allocate storage
> for string[64]}
>
> new(My2DStrArrayPtr,dummy,NumberOfRowsNeeded,NumberOfColumnsNeeded,MaxF
> ieldWidth);
> FOR i := 1 TO My2DStrArrayPtr^.M DO
> FOR j := 1 TO My2DStrArrayPtr^.N DO
> BEGIN
> StrBuffer := StringOf('Line ', i:1, ',', j:1);
> Str255Ptr(@My2DStrArrayPtr^[i,j,0])^ := StrBuffer;
> END;
> FOR i := 1 TO My2DStrArrayPtr^.M DO
> FOR j := 1 TO My2DStrArrayPtr^.N DO
> BEGIN
> StrBuffer := Str255Ptr(@My2DStrArrayPtr^[i,j,0])^;
> writeln(StrBuffer:length(StrBuffer));
> END;
>
> END.
>
> Gale Paeper
> gpaeper@empirenet.com
>


Date: Mon, 3 May 2004 16:31:43 +0800
From: Peter N Lewis <peter@stairways.com.au>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?


> >From lurking the Carbon mailing list, apparently Pointers are preferred over
>Handles in OSX, due to the improved memory implementation in OSX.
>
>If I'm remembering the discussions correctly, they said that Handles are just
>'kludged' on-top of the improved Pointers, and Handles no longer offer
>memory-management advantages as they did in Classic MacOS.

This is not really true.

Handles no longer move underneath you like they used to, and no
longer are useful for compacting the heap and such thins to conserve
space, all of which is fine since that is no longer needed.

However, Handles do still offer advantages in memory management for
the user (ie, you) in that you can resize them while retaining a
"pointer" to them in other placers in your code.

Still, as you say, if you never need to resize them then pointers are
more appropriate.
Peter.


Date: Mon, 3 May 2004 16:35:55 +0800
From: Peter N Lewis <peter@stairways.com.au>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?


>I'm thinking of doing something along these lines to create a dynamically
>allocated 2D array:
>
>
>Type
> PtrToLarge2DArray = ^Large2DArray;
> HandleToLarge2DArray =^PtrToLarge2DArray;
> Large2DArray = array[1..MaxLongInt] of array[1..MaxLongInt] of
>str[MaxFieldWidth];

You will not be able to do that as the size of the structure would
exceed 2^32. Indeed, you cannot even do :

type Large1DArray = array[1..MaxLongInt] of str[MaxFieldWidth]

The first question to ask is what kind of use you will put. Will you
actually use all of the array, or only parts of it? Will your
strings be typically most of the size of MaxFieldWidth or typically
smaller. I presume that MaxFieldWidth is actually constant.

Also important is the speed requirements for your access. Will you
be accessing the elements frequently. Is memory space or speed more
important (remembering that if you waste too much memory you will
start thrashing the harddisk and speed will go out the window). Is
MaxFieldWidth long enough that copying the string around is going to
be a problem? Are you going to be changing the strings or once set
are they constant?

Personally, I would consider doing something like this:

const
MaxFieldWidth = whatever (make sure it is odd!)
type
MyField = string[MaxFieldWidth]
MyFields = array[0..MaxLongInt div SizeOf(MyField)] of MyField;
MyFieldsPtr = ^MyFields;
MyStructure = record
rows: longint;
cols: longint;
fields: MyFieldsPtr;
end;

function CreateMyStructure( var s: MyStructure; r, c : longint ): OSStatus;
procedure DestroyMyStructure( var s: MyStructure );
function MyStructureToIndex( const s: MyStructure; r, c: longint ): longint;
function MyStructureGet( const s: MyStructure; r, c: longint ): MyField;
function MyStructureSet( const s: MyStructure; r, c: longint; const
f: MyField );

CreateMyStructure would allocate space in fields for r*c fields.
DestroyMyStructure would dispose the space
MyStructureToIndex would return (r-1)*rows+c-1
MyStructureGet and MyStructureSet would use MyStructureToIndex and
assign or read the field.

Implementation is left up to the reader.

If the array is sparse (ie, you wont be using most of it), instead of
MyField being a string, make it a pointer to a string, and create
them as needed.

If the strings vary in size a lot, do the above and also create the
strings to the appropriate size.

Enjoy,
Peter.


From: "James Chandler Jr" <jchandjr@bellsouth.net>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
Date: Mon, 3 May 2004 11:46:10 -0400


- ----- Original Message -----
From: "Peter N Lewis" <peter@stairways.com.au>
To: "Subscribers of macpascal" <macpascal@avalon.pascal-central.com>
Sent: Monday, May 03, 2004 4:31 AM
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?


> > >From lurking the Carbon mailing list, apparently Pointers are preferred
over
> >Handles in OSX, due to the improved memory implementation in OSX.
> >
> >If I'm remembering the discussions correctly, they said that Handles are just
> >'kludged' on-top of the improved Pointers, and Handles no longer offer
> >memory-management advantages as they did in Classic MacOS.
>
> This is not really true.
>
> Handles no longer move underneath you like they used to, and no
> longer are useful for compacting the heap and such thins to conserve
> space, all of which is fine since that is no longer needed.
>
> However, Handles do still offer advantages in memory management for
> the user (ie, you) in that you can resize them while retaining a
> "pointer" to them in other placers in your code.
>
> Still, as you say, if you never need to resize them then pointers are
> more appropriate.

Thanks, Peter. Good points.

I used Handles mainly for memory conservation, because the apps did lots of
allocation/deallocation during program flow. Got so accustomed to HLockHi() to
churn the handles and make sure some empty space stayed available for new
allocations.

If one keeps multiple references to the same handle, you are correct that
Handles might still offer advantages on OSX. Didn't think of that.

JCJR


Date: Mon, 03 May 2004 09:11:03 -0700
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: Vik Rubenfeld <vikr@mindspring.com>


On 5/3/04 1:35 AM, "Peter N Lewis" <peter@stairways.com.au> wrote:

>> I'm thinking of doing something along these lines to create a dynamically
>> allocated 2D array:
>>
>>
>> Type
>> PtrToLarge2DArray = ^Large2DArray;
>> HandleToLarge2DArray =^PtrToLarge2DArray;
>> Large2DArray = array[1..MaxLongInt] of array[1..MaxLongInt] of
>> str[MaxFieldWidth];
>
> You will not be able to do that as the size of the structure would
> exceed 2^32. Indeed, you cannot even do :
>
> type Large1DArray = array[1..MaxLongInt] of str[MaxFieldWidth]
>
> The first question to ask is what kind of use you will put. Will you
> actually use all of the array, or only parts of it? Will your
> strings be typically most of the size of MaxFieldWidth or typically
> smaller. I presume that MaxFieldWidth is actually constant.
>
> Also important is the speed requirements for your access. Will you
> be accessing the elements frequently. Is memory space or speed more
> important (remembering that if you waste too much memory you will
> start thrashing the harddisk and speed will go out the window). Is
> MaxFieldWidth long enough that copying the string around is going to
> be a problem? Are you going to be changing the strings or once set
> are they constant?
>
> Personally, I would consider doing something like this:
>
> const
> MaxFieldWidth = whatever (make sure it is odd!)
> type
> MyField = string[MaxFieldWidth]
> MyFields = array[0..MaxLongInt div SizeOf(MyField)] of MyField;
> MyFieldsPtr = ^MyFields;
> MyStructure = record
> rows: longint;
> cols: longint;
> fields: MyFieldsPtr;
> end;
>
> function CreateMyStructure( var s: MyStructure; r, c : longint ): OSStatus;
> procedure DestroyMyStructure( var s: MyStructure );
> function MyStructureToIndex( const s: MyStructure; r, c: longint ): longint;
> function MyStructureGet( const s: MyStructure; r, c: longint ): MyField;
> function MyStructureSet( const s: MyStructure; r, c: longint; const
> f: MyField );
>
> CreateMyStructure would allocate space in fields for r*c fields.
> DestroyMyStructure would dispose the space
> MyStructureToIndex would return (r-1)*rows+c-1
> MyStructureGet and MyStructureSet would use MyStructureToIndex and
> assign or read the field.
>
> Implementation is left up to the reader.
>
> If the array is sparse (ie, you wont be using most of it), instead of
> MyField being a string, make it a pointer to a string, and create
> them as needed.
>
> If the strings vary in size a lot, do the above and also create the
> strings to the appropriate size.
>
> Enjoy,
> Peter.


HandleToPTDocStyleRunRecordArray = ^PtrToPTDocStyleRunRecordArray;
PtrToPTDocStyleRunRecordArray = ^PTDocStyleRunRecordArray;
PTDocStyleRunRecordArray = array[1..MaxLongInt] of
PTDocStyleRunRecord;

{style record- records all formatting for a given style run, or set of style
runs having the same style}
PTDocStyleRecord = record
TotalTabs: integer;
TabLocations: PTDocTabLocations;
CurrentPTDocStyle: PTDocStyles;
CurrentFontID: integer;
CurrentFontSize: integer;
CurrentColor: GRTextColorTypes;
CurrentAdjustSetting: PTDocAdjustSettings;
CurrentLeftIndentInPixels: integer;
VerticalLinesAreOn: boolean;
NumberOfVerticalLines: integer;
HlocsOfVerticalLines: PTDocHLocsOfVerticalLinesArray;
TabLocationsCheckSum: LongInt;
end;


Date: Mon, 03 May 2004 09:11:43 -0700
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: Vik Rubenfeld <vikr@mindspring.com>


On 5/3/04 1:35 AM, "Peter N Lewis" <peter@stairways.com.au> wrote:

>> I'm thinking of doing something along these lines to create a dynamically
>> allocated 2D array:
>>
>>
>> Type
>> PtrToLarge2DArray = ^Large2DArray;
>> HandleToLarge2DArray =^PtrToLarge2DArray;
>> Large2DArray = array[1..MaxLongInt] of array[1..MaxLongInt] of
>> str[MaxFieldWidth];
>
> You will not be able to do that as the size of the structure would
> exceed 2^32. Indeed, you cannot even do :
>
> type Large1DArray = array[1..MaxLongInt] of str[MaxFieldWidth]
>
> The first question to ask is what kind of use you will put. Will you
> actually use all of the array, or only parts of it? Will your
> strings be typically most of the size of MaxFieldWidth or typically
> smaller. I presume that MaxFieldWidth is actually constant.
>
> Also important is the speed requirements for your access. Will you
> be accessing the elements frequently. Is memory space or speed more
> important (remembering that if you waste too much memory you will
> start thrashing the harddisk and speed will go out the window). Is
> MaxFieldWidth long enough that copying the string around is going to
> be a problem? Are you going to be changing the strings or once set
> are they constant?
>
> Personally, I would consider doing something like this:
>
> const
> MaxFieldWidth = whatever (make sure it is odd!)
> type
> MyField = string[MaxFieldWidth]
> MyFields = array[0..MaxLongInt div SizeOf(MyField)] of MyField;
> MyFieldsPtr = ^MyFields;
> MyStructure = record
> rows: longint;
> cols: longint;
> fields: MyFieldsPtr;
> end;
>
> function CreateMyStructure( var s: MyStructure; r, c : longint ): OSStatus;
> procedure DestroyMyStructure( var s: MyStructure );
> function MyStructureToIndex( const s: MyStructure; r, c: longint ): longint;
> function MyStructureGet( const s: MyStructure; r, c: longint ): MyField;
> function MyStructureSet( const s: MyStructure; r, c: longint; const
> f: MyField );
>
> CreateMyStructure would allocate space in fields for r*c fields.
> DestroyMyStructure would dispose the space
> MyStructureToIndex would return (r-1)*rows+c-1
> MyStructureGet and MyStructureSet would use MyStructureToIndex and
> assign or read the field.
>
> Implementation is left up to the reader.
>
> If the array is sparse (ie, you wont be using most of it), instead of
> MyField being a string, make it a pointer to a string, and create
> them as needed.
>
> If the strings vary in size a lot, do the above and also create the
> strings to the appropriate size.
>
> Enjoy,
> Peter.



Date: Mon, 03 May 2004 09:23:46 -0700
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: Vik Rubenfeld <vikr@mindspring.com>


> You will not be able to do that as the size of the structure would
> exceed 2^32. Indeed, you cannot even do :
>
> type Large1DArray = array[1..MaxLongInt] of str[MaxFieldWidth]

I do something that appears to be along those lines:

type PTDocStyleRunRecordArray = array[1..MaxLongInt] of PTDocStyleRunRecord;

...as in:

{style record- records all formatting for a given style run, or set of style
runs having the same style}
PTDocStyleRecord = record
TotalTabs: integer;
TabLocations: PTDocTabLocations;
CurrentPTDocStyle: PTDocStyles;
CurrentFontID: integer;
CurrentFontSize: integer;
CurrentColor: GRTextColorTypes;
CurrentAdjustSetting: PTDocAdjustSettings;
CurrentLeftIndentInPixels: integer;
VerticalLinesAreOn: boolean;
NumberOfVerticalLines: integer;
HlocsOfVerticalLines: PTDocHLocsOfVerticalLinesArray;
TabLocationsCheckSum: LongInt;
end;

HandleToPTDocStyleRunRecordArray = ^PtrToPTDocStyleRunRecordArray;
PtrToPTDocStyleRunRecordArray = ^PTDocStyleRunRecordArray;
PTDocStyleRunRecordArray = array[1..MaxLongInt] of
PTDocStyleRunRecord;

...and it's always worked well.

> The first question to ask is what kind of use you will put. Will you
> actually use all of the array, or only parts of it?

I will use all of the array, up to the total number of rows and columns
allocated when the array is created.

> Will your
> strings be typically most of the size of MaxFieldWidth or typically
> smaller. I presume that MaxFieldWidth is actually constant.

Typically smaller. MaxFieldWidth is a constant.

> Also important is the speed requirements for your access.

Speed of access is very important.

> Will you
> be accessing the elements frequently.

Yes.

> Is memory space or speed more
> important (remembering that if you waste too much memory you will
> start thrashing the harddisk and speed will go out the window).

Speed.

> Is
> MaxFieldWidth long enough that copying the string around is going to
> be a problem?

MaxFieldWidth has never exceeded 255 characters.

> Are you going to be changing the strings or once set
> are they constant?

There is a need to change them.

> Personally, I would consider doing something like this:

[....]

I hadn't considered an array of pointers to strings, which conserves memory,
and even permits strings greater than MaxFieldWidth if necessary.

Is there a speed hit from accessing the strings via pointers?

Thanks very much for this excellent advice.

-Vik


Date: Tue, 4 May 2004 13:40:07 +0800
From: Peter N Lewis <peter@stairways.com.au>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?


>I do something that appears to be along those lines:
>
>type PTDocStyleRunRecordArray = array[1..MaxLongInt] of PTDocStyleRunRecord;

>...and it's always worked well.

It probably is fine, most of the time. But
SizeOF(PTDocStyleRunRecordArray) is likely a bogus number, quite
probably MacLongInt*SizeOf(PTDocStyleRunRecord) mod 2^32, which could
be a negative number or indeed pretty much any number. In fact, for
even sizes of PTDocStyleRunRecordArray, it will actually be
2*MacLongInt-SizeOf(PTDocStyleRunRecord), so you are probably
actually safe as long as the compiler always uses unsigned arithmetic
for the size. But opnly by fluke which is never a good way to
program. If for example PTDocStyleRunRecordArray was 1k bytes and
for some reason you declared it as:

type PTDocStyleRunRecordArray = array[1..4194305] of PTDocStyleRunRecord;

then SizeOf(PTDocStyleRunRecordArray) would actually be 1024 which
since it was less than 23k could result in the compiler getting
confused and using short integer arithmetic for the array access.

> > The first question to ask is what kind of use you will put. Will you
> > actually use all of the array, or only parts of it?
>
>I will use all of the array, up to the total number of rows and columns
>allocated when the array is created.

Hmm, looking back at your original query, you've got:

NumberOfRowsNeeded:= 75000;
NumberOfColumnsNeeded:= 2000;

Which means that rows*cols is 140 million. That would mean 140 Meg
for each byte of MyField. If that is really the size of your array,
you're probably going to be in trouble since you will rapidly run out
of RAM and address space even if you don't run out of virtual memory.

> > Will your
>> strings be typically most of the size of MaxFieldWidth or typically
>> smaller. I presume that MaxFieldWidth is actually constant.
>
>Typically smaller. MaxFieldWidth is a constant.
>
>> Also important is the speed requirements for your access.
>
>Speed of access is very important.
>
>> Will you
>> be accessing the elements frequently.
>
>Yes.
>
>> Is memory space or speed more
>> important (remembering that if you waste too much memory you will
>> start thrashing the harddisk and speed will go out the window).
>
>Speed.
>
>> Is
>> MaxFieldWidth long enough that copying the string around is going to
>> be a problem?
>
>MaxFieldWidth has never exceeded 255 characters.

Unless you don't expect rows and cols to get quite as big as
indicated, you're actually wrong, you need to concentrate on memory
because you aren't going to fit it all in unless you're careful, in
which case the trivial issues of access time you are talking about
here will be irrelevant while you read/write the array to virtual
memory or simply fail to fit it in memory.

>I hadn't considered an array of pointers to strings, which conserves memory,
>and even permits strings greater than MaxFieldWidth if necessary.
>
>Is there a speed hit from accessing the strings via pointers?

Not really, the indirection is almost irrelevant if you're working on
strings. But what would be relevant is for each pointer, you would
have the four bytes of pointer, and probably a bunch of overhead from
the system memory manager - with 75000 * 2000 array, you'd probably
be out of memory/address space with MaxFieldWidth at 1.

Enjoy,
Peter.


Date: Mon, 03 May 2004 23:49:02 -0700
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: Vik Rubenfeld <vikr@mindspring.com>


> It probably is fine, most of the time. But
> SizeOF(PTDocStyleRunRecordArray) is likely a bogus number, quite
> probably MacLongInt*SizeOf(PTDocStyleRunRecord) mod 2^32, which could
> be a negative number or indeed pretty much any number.

Very informative. Thank you, Peter.

> Hmm, looking back at your original query, you've got:
>
> NumberOfRowsNeeded:= 75000;
> NumberOfColumnsNeeded:= 2000;

It's for a project that is currently obtaining data at a rate of about 5000
records (i.e. rows) per year. So in five years (planning for the future)
there could reasonably be 25000 rows. There are currently 1850 fields per
record, so let's call that 2000 columns.

Following your post this morning, I'm currently looking into the possibility
of using pointers so as to reduce the amount of memory used by the data
itself. The data itself will probably occupy about 24 megabytes per year,
based on the amount of data obtained for the first 3 months of the project.
(Why only 24 megs of actual data allocated across 10 million fields? The
answer is, most fields contain very little data, and there are a large
percentage of empty fields as well.)

So in 5 years there may be 25000 * 2000 = 50 million pointers, pointing to
120 megs of data.

50 million pointers * 4 bytes each = 200 million bytes of ram, or a bit
under 200 megabytes, just for the pointers. 200 megs plus the 120 megs of
actual data is around 320 megs of ram, which is not excessive.

I wonder if OS-X can deal with 50 million pointers.

Of course, there's the question of what sort of data structures to use to
keep track of 50 million pointers. :) I've been writing code for that today,
based on the excellent advice I have received on this subject from this
list.

Does this appear to make the approach I'm considering more practical?


-Vik



Date: Tue, 4 May 2004 15:30:26 +0800
From: Peter N Lewis <peter@stairways.com.au>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?


>Following your post this morning, I'm currently looking into the possibility
>of using pointers so as to reduce the amount of memory used by the data
>itself. The data itself will probably occupy about 24 megabytes per year,
>based on the amount of data obtained for the first 3 months of the project.
>(Why only 24 megs of actual data allocated across 10 million fields? The
>answer is, most fields contain very little data, and there are a large
>percentage of empty fields as well.)

Ahh, ok, that was what I was asking when I mentioned "Will you
actually use all of the array, or only parts of it?". If many fields
are empty, then you are not using the whole array (that is then
called a sparse array).

>
>So in 5 years there may be 25000 * 2000 = 50 million pointers, pointing to
>120 megs of data.
>
>50 million pointers * 4 bytes each = 200 million bytes of ram, or a bit
>under 200 megabytes, just for the pointers. 200 megs plus the 120 megs of
>actual data is around 320 megs of ram, which is not excessive.
>
>I wonder if OS-X can deal with 50 million pointers.

I would find a way to avoid using system pointers. The reason is, a
system pointer does not in fact take up 4 bytes. Since a pointer
knows its size (GetPtrSize requires it) it must take up at least 8
bytes. And it might well take up more than that with other
housekeeping stuff, perhaps 16 bytes or more.

Instead of using real pointers, what you probably will want to do is
write your own memory manager designed specifically for this task
using large blocks of memory allocated from the system.

So you your probably still allocate your array of pointers as before
as one big block, but for your fields you would allocate big chunks
(or indeed for your purposes you could probably just allocate one big
chunk of 120Meg and that will work until your needs exceed it and
then just just allocate more).

Now you have to write your own NewField and DisposeField.

One relatively simple way to do this is to have one big block of
memory (120Meg) which you allocate your fields in.

If you never had to dispose any fields (ie, you never changed fields
- or at least never resized them), then all you would need is a
single index into this block and each time you allocate a new field,
the pointer is @fields[index] and then you increment index by the
size of the required field.

But if you need to dispose of fields too, then life is somewhat more
complicated.

One way to do this is to have free fields linked based on their size.
So you have an array[4..MaxFieldSize] of Pointer, each pointer
pointing in to the memory space and then linked from there to the
next free block of the same size. So to allocate a new block of size
N, you look at the freepointer[n] entry and if that is not nil, use
that pointer and set freepointer[n] to the linked through value.
Otherwise allocate a new pointer as described above. To free an
entry, set its contents to be freepointer[size] and set
freepointer[size] to be the freed link.

It would be easier to understand with a diagram, so try drawing out
the memory layount to understand what is going on.

So say for example you have your memory space for fields defined as:

const MaxMyFields = 120000000;
type MyFields = packed array[0..MaxMySpace] of char;
MyFieldsPtr = ^MyFields;

fields: MyFieldsPtr;
freespace: longint;
freepointers: array[4..MaxFieldSize] of Pointer;

new(fields);

clear freespace and freepointers;

NewField( requiredsize ) is
requiredsize := max(4,requiredsize);
if freepointers[requiredsize] <> nil then begin
result := freepointers[requiredsize];
freepointers[requiredsize] := PointerPointer(freepointers[requiredsize])^
end else begin
result := @fields[freespace];
freespace += requiredsize
end;

DisposeField( f: Pointer ) is
thissize := ByteAt( f );
thissize := max(4,thissize);
PointerPointer(f)^ := freepointers[thissize];
freepointers[thissize] := f;

or something along those lines.

It is prtty simple, it just requires a bit of thought and some
drawing to see how it all works.

Enjoy,
Peter.


Date: Tue, 04 May 2004 01:38:28 -0700
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?
From: Vik Rubenfeld <vikr@mindspring.com>


> Ahh, ok, that was what I was asking when I mentioned "Will you
> actually use all of the array, or only parts of it?". If many fields
> are empty, then you are not using the whole array (that is then
> called a sparse array).

I started considering using sparse arrays after reading your posts on this
thread. :)

> I would find a way to avoid using system pointers. The reason is, a
> system pointer does not in fact take up 4 bytes. Since a pointer
> knows its size (GetPtrSize requires it) it must take up at least 8
> bytes. And it might well take up more than that with other
> housekeeping stuff, perhaps 16 bytes or more.

I had no idea.

> Instead of using real pointers, what you probably will want to do is
> write your own memory manager designed specifically for this task
> using large blocks of memory allocated from the system.

Excellent.

> If you never had to dispose any fields (ie, you never changed fields
> - or at least never resized them), then all you would need is a
> single index into this block and each time you allocate a new field,
> the pointer is @fields[index] and then you increment index by the
> size of the required field.

I don't need to change the contents of the fields. This makes it much easier
per your description.

This is great info. I will be implementing all this tomorrow. Thanks very
much, Peter.

-Vik


Date: Thu, 6 May 2004 19:21:41 +0200
From: Jean-Yves Bernier <bernier@pescadoo.net>
Subject: Re: Accessing Records in Dynamically Allocated 2D Arrays?


At 09:49 -0700 28/04/2004, Vik Rubenfeld wrote:

>The thread now identifies 3 ways of producing dynamically-allocated 2D
>arrays:
>
> - The Schema method, as described by Gale.
> - The one-pointer method, as described by Steve and Joe.
> - The array of pointers to arrays method, as described by James.

OOP can give a nice solution:


type DynamicArrayOfSomeType = object
width, height: longint;
data: Handle;

procedure Set (x,y:longint; value: SomeType);
function Get (x,y:longint): SomeType;
end;

procedure DynamicArrayOfSomeType.New (w,h: longint);
begin
self.width := w;
self.height := h;
self.data := NewHandle(w*h*sizeOf(SomeType));
end;

procedure DynamicArrayOfSomeType.Set (x,y: longint; value: SomeType);
var p: Ptr;
begin
p := Ptr(ord(data^)+(x+y*self.width)*sizeOf(SomeType));
SomeTypePtr(p)^ := value;
end;

function DynamicArrayOfSomeType.Get (x,y: longint): SomeType;
var p: Ptr;
begin
p := Ptr(ord(data^)+(x+y*self.width)*sizeOf(SomeType));
Get := SomeTypePtr(p)^;
end;


That's what I use in CW Pascal to do image processing.


[Return to 2D Arrays Main]



Copyright © 2004, MacPascal Mailing List