Showing posts with label Arbed. Show all posts

31 October 2015

Searching multiple Xojo projects

Finding that useful method or class in one of your numerous Xojo projects


If you've used Xojo (Real Studio) for a while, you'll probably have collected more than just one project.

And each of these projects contains unique code, and some of that may even be re-usable for other projects.

How do you keep track of all the code and methods you've written in the past? E.g., you remember that you once wrote that nifty function to count the words in a string, but where is it?

I am going to show you several ways to find code across multiple projects.


Mac OS X only: Use Spotlight


For Spotlight to be able to search Xojo projects, it needs to understand their content so that it can extract any text from them and add it to its searchable database. By default, textual project formats (.xojo_project, .xojo_xml_project, .xojo_script, .xml, .rbvcp, .rbs) are regarded as text files and are thus scanned automatically by Spotlight. However, it won't scan binary file formats such as .xojo_binary_project and .rbp - to make them usable by Spotlight, a so-called Spotlight Importer is needed.

Real Studio and early Xojo IDEs did include such an importer, but recent versions don't any more. However, as I originally wrote this importer, I've now made the effort and updated it for the latest Xojo file format (as of Xojo 2015r3). So, if you're on a Mac and would like to use Spotlight, I suggest you install this latest importer for Xojo:


This Spotlight importer will remember all the class and function names of all your projects, along with any other text they contain.

Let's assume you have a method named "CountWords". If you enter "countwords" into the Spotlight search field, you should see the project file listed in the results.

However, the results may be cluttered with a lot of other types of files not related to Xojo.

So, here's how you tell Spotlight to only show Xojo project files:
  • Open the Spotlight Finder window (e.g. by pressing cmd+F in the Finder).
  • If you do not see a popup menu under the Search: row, click the [+] button on the right.
  • Click on the leftmost popup menu (which probably reads Kind) and choose Other...
  • A new sheet dialog appears in which you can select a search attribute.
  • Find the "File extension" attribute. Tick its checkbox under the In Menu column.
  • While you're at it, also find the "Xojo class" attribute and tick its checkbox as well.
  • Then click OK to dismiss the dialog.
Now, whenever you use this Find window, you can choose File extension from the popup menu to limit the found items to the file types that your Xojo / Real Studio projects use: rbp, rbbas, xojo_binary_project etc.

Furthermore, if you are looking for a particular class, you can choose the Xojo class attribute and enter the (partial) class name there, leaving the main search field empty. That way, you'll see all the projects that contain that class.


All platforms: Searching inside VCP and XML projects


If you are used to saving your projects in the textual VCP (.rbvcp or .xojo_project and related class files) or XML (.xml or .xojo_xml_project) format, then you may have success performing simple searches using Spotlight on OS X or Windows Search on Windows.

If you need more control over what files are found, so that you don't get lots of false results from non-RB files, here are a few 3rd party programs you could try:

OS X

  • TextWrangler is free and has a Multi-File search in which you can set up filters with the extensions you want to search.
  • EasyFind mainly finds files by names, but if you specify to search files ending in .rbbas, .xojo_class etc., then you can also search for their content. (I'd have liked to tell you that my program Find Any File could also be of help here, but it doesn't search file contents - yet.)

Windows

  • UltraFileSearch does a good job. You'd use the Wildcards search in the Files and Folders tab, searching for "*.xojo_*", and then enter the name of the function or code under the Containing Text tab.
Unfortunately, most of these methods (apart from TextWrangler), including Spotlight, won't show you the actual code snippets, and the Xojo IDE also won't open ".xojo_window" and similar files alone, so you'll have to find their main project file and double click that, then use the IDE again to search for the function name.
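
If you'd rather see the matching lines themselves, a small script can do the job. The following is a minimal sketch in Python, not a feature of any of the tools above: the function name search_projects and the extension list are my own choices for illustration, with the extensions taken from the textual formats mentioned in this article.

```python
import os
import re

# Extensions of the textual project and class files mentioned in this article.
TEXT_EXTENSIONS = (
    ".xojo_project", ".xojo_xml_project", ".xojo_script", ".xojo_window",
    ".xojo_class", ".rbvcp", ".rbbas", ".rbs", ".xml",
)


def search_projects(root, pattern, extensions=TEXT_EXTENSIONS):
    """Yield (path, line_number, line) for each line under root matching pattern.

    pattern is a regular expression and is matched case-insensitively.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(extensions):
                continue  # skip files that are not textual project files
            path = os.path.join(folder, name)
            # errors="replace" keeps the scan going if a file is not valid UTF-8.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, number, line.rstrip()
```

Calling list(search_projects("/path/to/projects", "countwords")) returns one tuple per matching line, so you can tell which file to open before you launch the IDE. Binary projects (.rbp, .xojo_binary_project) are deliberately left out, because this sketch only reads text.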


All platforms: Arbed


Arbed is a tool that was developed mainly for dealing with larger projects and the tasks around them. One of its many features lets you search inside project files within a folder and all its sub folders (and while some features require a license purchase, this is a free feature).

Contrary to Spotlight, Arbed runs on Windows and Linux as well, so you can use this even if you do not use a Mac.

The advantage of Arbed over the above methods is that Arbed can show you the results conveniently, so you don't have to re-enter the search in the IDE before you know if the results contain what you were searching for.

Launch Arbed and then either drop a folder onto its Search Multiple Projects box or choose Find on Disk... from the File menu:


You can then enter the search text, even as a Regex formula, and also choose the folder in which you want to search:


Arbed then lists all found projects:


Double clicking any of the found items opens a new window showing that project, along with a window listing all occurrences. Clicking on any of the occurrences will show you the related project item (class, function etc.):


Right-Clicking on an item in the project list gives you the option to reveal the file on disk or open it in the Xojo / Real Studio IDE.

01 September 2013

Lessons in working with file formats in flux

As you may know, my tool Arbed for Xojo (formerly known as Real Studio) is capable of reading and modifying Xojo's project files. This ability is used to perform often needed operations that the Xojo IDE doesn't offer (yet), such as comparing projects and single classes, scripted code replacement, code obfuscation, preparing code for localization and more.

I'm going to tell you a little about the challenges I have to deal with.

There are 3 different project formats supported by Xojo:
  1. Binary, aka RBP, aka RbBF, using extensions .rbp and .xojo_binary_project (sheesh!)
  2. XML, using extensions .xml and .xojo_xml_project
  3. Textual, aka VCP, using extensions .rbvcp and .xojo_project

Dealing with RBP and XML


The first two formats are structurally identical: The RBP format is nothing more than a more compact version of the XML format, using a binary representation of the XML by using 4-letter-codes for the tags and 4 byte length fields to indicate the size of the elements that follow. (Xojo engineers informed me that actually RBP was designed first, and XML is a translation of the RBP format.)
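
To make the idea of "4-letter code plus 4-byte length" concrete, here is a toy reader and writer in Python. It is only a sketch of the general technique: the big-endian byte order, the flat (non-nested) layout and the codes "Name" and "Srce" are assumptions made up for this example, and the real RBP file has details that are not described here.

```python
import struct


def write_blocks(blocks):
    """Serialize (code, payload) pairs as: 4-byte code, 4-byte length, payload."""
    parts = []
    for code, payload in blocks:
        if len(code) != 4:
            raise ValueError("block codes must be exactly 4 characters")
        # ">I" is an unsigned 32-bit big-endian integer (byte order is an assumption).
        parts.append(code.encode("ascii") + struct.pack(">I", len(payload)) + payload)
    return b"".join(parts)


def read_blocks(data):
    """Split data back into (code, payload) pairs, checking for truncation."""
    blocks = []
    offset = 0
    while offset < len(data):
        if offset + 8 > len(data):
            raise ValueError("truncated block header")
        code = data[offset:offset + 4].decode("ascii")
        (length,) = struct.unpack(">I", data[offset + 4:offset + 8])
        payload = data[offset + 8:offset + 8 + length]
        if len(payload) != length:
            raise ValueError("truncated payload")
        blocks.append((code, payload))
        offset += 8 + length  # the length field tells us where the next block starts
    return blocks


# Hypothetical codes, for illustration only.
sample = write_blocks([("Name", b"Foo"), ("Srce", b"Sub Foo()")])
```

The point of the length field is that a reader can skip over any block whose code it doesn't recognize and still find the next one, which is what makes such a format easy to handle without understanding every element.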

Arbed originally only supported these first two formats, and it was pretty easy to handle. The data was well structured and repetitive in a way that gives little room for (problematic) surprises. If you've ever looked at an XML (or HTML) file, you probably understand what I mean.

When Arbed reads a project in the RBP and XML formats, it modifies only the parts it understands well while leaving all the other data untouched. For instance, when it changes some source code inside a function, it only modifies the affected <SourceLine> elements. That way, it can safely modify the project even if Arbed doesn't understand everything that's inside the project. And believe me, even though XML is apparently self-explanatory, there are a lot of unexplained things in there that just leave one guessing.

The blame for this can be put on Xojo for not documenting this. I guess they do not even have an internal documentation. As so often in rushed software engineering, the code is the only documentation (for that, it doesn't even deserve the term "engineering"). When I am about to write complex code, I usually start with documenting and specifying it first, so that I (and others that might join the project) can later refer to that. It really helps, and should be obvious. However, most people do not follow this simple rule. Many never even learn. It even starts with documenting your single subroutines. See my little article on coding guidelines.

Converting between RBP and XML formats.


It gets a little more complicated when using Arbed's Convert operations. They're available in the main window (titled Arbed Drop Pad). For instance, to convert an RBP to an XML file, it has to know how the 4-letter-codes in the RBP file translate to the XML tag names.

It happens every once in a while that Xojo adds new features to the project file format that lead to using newly named element tags. For instance, when the Web Edition was added, a new XML tag named WebApp got added. Now, while Arbed doesn't need to know what this value means, it needs to know how to translate it between the XML tag name and the RBP tag code. Therefore, when you're using an Arbed version that doesn't know this translation yet, it'll tell you about it if you ask it to convert a newer project file with yet-unknown tags in it. I will then have to update Arbed with the new code (which is easy to do, I just have to save the same project in both XML and RBP format in the IDE and look for the code in question in both files).
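
The translation step can be pictured as a simple lookup table that refuses to guess. This Python sketch is not Arbed's actual code: the tag names SourceLine and WebApp come from this article, but the 4-letter codes paired with them here are invented placeholders.

```python
# Hypothetical table: the XML tag names are real, the 4-letter codes are made up.
XML_TO_RBP = {
    "SourceLine": "SrcL",
    "WebApp": "WbAp",
}

# The reverse direction is derived from the same table, so both stay in sync.
RBP_TO_XML = {code: name for name, code in XML_TO_RBP.items()}


class UnknownTagError(Exception):
    """Raised when a project uses tags the translation table doesn't know yet."""


def translate_tags(names, table):
    """Translate every name through table, or report all unknown names at once."""
    unknown = sorted({name for name in names if name not in table})
    if unknown:
        raise UnknownTagError("no translation known for: " + ", ".join(unknown))
    return [table[name] for name in names]
```

Collecting all unknown tags before raising means a single failed conversion reports everything that needs to be added to the table, rather than one tag per attempt.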

Overall, working with RBP and XML projects in Arbed is therefore fairly safe and fast.

Arbed only rarely needs updates, e.g. when a new tag code gets added, and then only to make the Convert functions work (which are not even really necessary because you can just use the IDE to save a project in a different format).

Dealing with the VCP format


The VCP is a different and much more scary beast.

In theory, it's just plain RbScript code, which is fairly well specified (even though Xojo has failed to present a syntax/grammar spec for their language for 15 years now, it's fairly well understood by me by now, and my background in compiler design helps there, too).

I even have a fully working RbScript parser that I hope to use soon to implement some great features such as method name obfuscation (i.e. rename all custom method names in your source in order to foil reverse engineering attempts), automatic code reformatting and dead code identification and removal.

But the reality is harsh. The VCP format has a lot of inconsistencies and hard-to-understand behaviors that make parsing it challenging.

Some examples:

Properties of Controls vary in representation


Recent Real Studio versions tended to write some integer values of Control properties as floating point numbers. If you're using version control you may have noticed that a control's Width and Height were sometimes shown as 1.6e2 when you had entered 160. Recent Xojo versions seem to have finally fixed this.
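
If such values clutter your diffs, you can normalize them before comparing. The helper below is a minimal Python sketch with a made-up name (normalize_number); it is a comparison aid under the assumption that a whole-valued float and its integer spelling mean the same thing to the IDE, and it is not how the IDE or Arbed treats these values.

```python
def normalize_number(text):
    """Return whole-valued numbers such as "1.6e2" in plain integer form ("160").

    Anything that is not a number, or not a whole number, is returned unchanged.
    """
    try:
        value = float(text)
    except ValueError:
        return text  # not numeric at all, e.g. a control name
    if value.is_integer():
        return str(int(value))
    return text
```

With this, normalize_number("1.6e2") and normalize_number("160") both give "160", so the two spellings compare as equal, while a genuinely fractional value such as "0.5" is left alone.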

Odd spacing in method declarations


This is what a normal Method declaration looks like when using qualified type identifiers:
Sub Foo(x as a.b)
And this is what an External Method looks like:
Declare Sub extmethod Lib ""  Foo(x as a . b)
Note the inserted blanks around the period. They're syntactically allowed but I didn't type them - the VCP format inserted them automagically.

Troubles with Attributes


Let's enter 3 attributes with, admittedly, some unusual values:
  • test: a
  • t2: \x22 (which gets automatically quoted into: "\x22")
  • t3: b=," (which gets automatically quoted into: "b=,""")

Now let's see what they look like in the VCP format.

This is what a normal Method looks like:
Attributes( test = a, t2 = "\x22", t3 = "b=,""" )  Sub Foo()
Alright. That's fairly readable. (Though what's with those spaces inside of the parentheses? Also note the double space before "Sub".)

Here's a similar Delegate declaration:
Attributes( test = a, t2 = "\x22", t3 = "b=,""" ) Delegate Sub Foo()
Looks pretty identical, doesn't it? And now we can also understand that double space: It makes room for plugging a "Delegate" word in there :)

But wait. Method declarations are surrounded by #tag markers, providing extra information that the RBP and XML formats usually store in extra elements or attribute fields. Here's the one for a normal Method:
#tag Method, Flags = &h0
Okay. Now the one for a Delegate:
#tag DelegateDeclaration, Flags = &h0, Attributes = \"test \x3D a\x2C t2 \x3D "\x22"\x2C t3 \x3D "b\x3D\x2C""""
Uh, what? Not only does it include the Attributes that are already present - and much more readable - in the declaration source code line, but they also appear only in this special "delegate" method declaration and not in a normal method - even though they have the same syntax and should thus be created the same way.

But it gets even weirder: The above was created in the latest Xojo IDE. When doing the same with Real Studio 2012r2.1, the above line does not include the Attributes in the #tag line. So, someone must have accidentally added this nonsense just recently, and only to Delegates, not to normal functions. Or maybe it wasn't an accident. In any case, it's quite a mess.

Encoding horrors


As a final example, let's focus on the odd encoding of attributes in the #tag line. It's obvious that it tries to escape some codes so that they may also be used inside the attribute values. It looks like a homemade algorithm, but appears to work pretty well. As you can see above, I've used characters as values that it also uses for escaping to see if I can break it. But I couldn't. That is, almost. When I set the value of an attribute to something containing line feeds (returns), then the IDE crashes hard or the compiler spills out inexplicable error messages. Oh, and later I found that if you put \x22 into an attribute's value, it'll be converted to a comma next time you open the project. Doesn't happen in RBP format, of course.

Sure, the test with the return in the attribute value was something that's unlikely to happen, but it shows that the whole thing is put together with little understanding of how escaping random data should be done. Heck, since the Xojo IDE is programmed in Xojo, they could have just base64-encoded the string, or used quoted-printable encoding. That's what they're meant for, and both are functions that Xojo provides anyway! But no, a new technique had to be invented: Something unreadable and prone to crashing.
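
To show what I mean by the base64 route, here is a minimal sketch. It is written in Python rather than Xojo, and the function names encode_value and decode_value are invented for the example; it only demonstrates that a standard encoding survives the very values that trip up the homemade one.

```python
import base64


def encode_value(value):
    """Encode an arbitrary attribute value as a single-line ASCII token."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode_value(token):
    """Restore the exact original value from its token."""
    return base64.b64decode(token.encode("ascii")).decode("utf-8")


# The awkward values from above, plus one containing a line feed.
tricky_values = ['a', '\\x22', 'b=,"', 'line one\nline two']
tokens = [encode_value(value) for value in tricky_values]
```

Each token consists only of letters, digits, "+", "/" and "=", so it can never contain a comma, a quote, a backslash or a line break, and decoding gives back the original value byte for byte. The price is that the value is no longer readable in the file, but the escaped form isn't exactly readable either.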

What this all means


Why do I even bother with this painful stuff?

Well, here's the thing: In 2012, I enhanced Arbed to read and write the VCP format.

It was one of the worst decisions I had made in past years, because it took me several months of intensive work to get even close to being usable. And since then I spent many more days working out all the kinks. These kinks are what you see above: Hardly anything that looks similar also behaves similarly. Behaviors change in different RS and Xojo versions, sometimes without making sense. I had to identify and special-case them all, for each single IDE release, so that Arbed generates the exact same output that the IDE does.

I am still not sure that it was worth the effort, because I do not use the VCP format for myself, and until then, all the features I added to Arbed were needed by myself. But when I started selling Arbed, I believed I had to add this feature because many expected it to just work.

Now, every time Xojo changes the output format of the VCP files only slightly, it might mean that my own code could miss it. For instance, some files use undocumented flag values, such as &h1000. No idea what it's for. Formerly, I had just generated these flags from the declaration source line. But now, sometimes, the Flag in the #tag line contains more information and I have to preserve it. Special handling galore!

Arbed's VCP output needs to match that of the IDE exactly


The issue is that, contrary to how Arbed can edit just specific parts of a method's source code in the RBP and XML formats, it can't do this with VCP files. Arbed's project modification code was written to work on the RBP/XML format directly, in order to keep anything unmodified that the user doesn't directly alter. For this functionality to work with VCP files, Arbed has to convert them internally into RBP format, so that my existing code can operate on them. Then, when the user saves their changes, Arbed recreates the VCP file from the updated RBP code (Arbed only rewrites the files that were affected by the changes). But that means that I have to recreate the entire VCP file from RBP data, instead of just modifying the source lines or whatever else was changed by the user as I did with RBP/XML files.

All this requires that Arbed understands every detail of a VCP file in order to properly convert it to RBP and back, internally.

Potential for damage


And that's what bit me a few times already in the past. I would miss small changes in the format, such as that color constants were originally written as hex values (&h...), but recently Xojo changed this to using &c... codes instead. My code was not prepared for this and would then write any color back as a 0 (black).

Basically, Arbed could, unknowingly, damage your VCP files if you edited and saved them in Arbed. You'd notice eventually, but Arbed should notice this before it does such damage.

So, how do you make sure that your code that reads and writes an external file format doesn't damage it just because the file format introduces changes that you are not aware of, yet?

The solution is: When reading the input, recreate the would-be output from it right away and then verify that both match. If there's a mismatch, your code is prone to damaging the file when writing it back.

Simple as that.
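
Here is the idea as a minimal Python sketch. The one-property-per-line "format" and the names parse_old, serialize and round_trips are toys invented for this illustration and have nothing to do with Arbed's real code; the reader deliberately has the color blind spot described above.

```python
def parse_old(text):
    """Toy reader that only understands &h colors; anything else becomes 0."""
    properties = {}
    for line in text.splitlines():
        name, _, value = line.partition(" = ")
        if value.startswith("&h"):
            properties[name] = int(value[2:], 16)
        else:
            properties[name] = 0  # silent data loss: &c colors end up black
    return properties


def serialize(properties):
    """Toy writer: always emits colors as six-digit &h values."""
    return "".join(f"{name} = &h{value:06X}\n" for name, value in properties.items())


def round_trips(text, parse, write):
    """True if writing back what was just read reproduces the input exactly."""
    return write(parse(text)) == text


known_format = "BackColor = &hFF8000\n"
newer_format = "BackColor = &cFF8000\n"
```

round_trips(known_format, parse_old, serialize) is True, while round_trips(newer_format, parse_old, serialize) is False. The second result is the warning sign: the check fails right after reading, before anything has been written to disk.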

Arbed 1.7 is made safer by performing a self-test on any VCP file it reads.


Therefore, from Version 1.7b9 on (which I'll release shortly), Arbed will verify its own VCP read/write code to make sure that it entirely understands the project it reads and (optionally) writes.

That way, when you save a project, anything you didn't specifically change will remain the same in the rewritten VCP file. And when you use version control (git, mercurial, subversion) with your VCP files, it won't show lots of unrelated changes after Arbed wrote the file back (even if it's still valid to Xojo). That's something Arbed may do even better than the Xojo IDE itself (ever seen those TabStops disappear and reappear randomly?)

Arbed accomplishes this self-check by converting the read project into its internal RBP format, then back to the VCP representation, then comparing its VCP output with the original file. If it finds a mismatch, it warns the user that the data was not fully understood. It also creates a file on the user's Desktop containing the specifics of the rendering differences, so that the user can tell what's gone wrong (allowing the more experienced user to decide if the changes are benign), and if the user sends the file to me, I can quickly fix Arbed and release a new version that deals with it.
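
A difference report of this kind is easy to produce with standard tools. The following Python sketch uses the standard difflib module; the function name render_differences and the labels in the report are my own choices for the example, and Arbed's actual report may well look different.

```python
import difflib


def render_differences(original, regenerated, name="project file"):
    """Return a unified diff between the file as read and as regenerated.

    The result is an empty string when both texts are identical.
    """
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        regenerated.splitlines(keepends=True),
        fromfile=f"{name} (as read)",
        tofile=f"{name} (as regenerated)",
    )
    return "".join(diff)


report = render_differences(
    "Width = 160\nBackColor = &cFF8000\n",
    "Width = 160\nBackColor = &h000000\n",
    name="Window1",
)
```

Here the report contains the line "-BackColor = &cFF8000" followed by "+BackColor = &h000000", which shows at a glance which property was not understood. An empty report means the file was reproduced exactly, and a non-empty one can be written to a text file and sent along with a bug report.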

Of course, if Arbed reports a problem with its VCP conversion, one can always just use the Xojo IDE to make the conversion: Open your project in Xojo, and use the Save As... menu command to save the project in the binary or XML format. Then use that file with Arbed, and if you've made changes in Arbed and saved them back to the project file, open that again in Xojo and use again Save As... to save it in VCP format. It's tedious but that's the way it's been working even before Arbed added support for VCP projects.

16 August 2013

Updates for Arbed, Zip Classes and CustomEditField

My Zip Classes for Xojo / REALbasic have been updated to v3.3.2, fixing a bug on Linux and improving the demo to handle the case when the destination zip can't be created, letting the user choose a new destination folder.

Arbed, my project editor for REALbasic projects, is currently in beta for supporting Xojo projects. I've just released v1.7.0b7 that fixes a critical issue when saving modified Xojo projects in VCP (textual) format (using extensions .rbvcp or .xojo_project): With previous versions, it could happen that colors of controls and windows would be reset to black (i.e. when a color property uses &c instead of an integer or hex (&h) number for the color code).

Oh, and CustomEditField was recently updated as well, mainly adding improvements on syntax highlighting for REALbasic code.

07 August 2013

When I use too many programs that I also maintain

This post has nothing particular of interest, I'm just venting because I need a break. Read on to understand why:

I'm very detail oriented when I program. And when I see a critical bug, I need to fix it right away. I get so hooked on it that I often go into long sessions, and I'll be miserable if I have to stop in the middle of it.

Now, here's an example of what often happens then:

I am working on a paid project for a client. It has to do with recovering data from a complex file structure.

The data structures are in binary format, so I need to be able to read the data inside.

Since I have my own disk editor, iBored, I start writing a template for this file format. Since iBored's template system is a work in progress, it naturally leads to me having to add new code to iBored to suit a particular new construct in the template syntax. So I add new code to iBored.

The template system uses RbScript so that I can perform calculations to decipher complex data structures. RbScript is fairly limited by default, though. For instance, there is no sort function.

Which means that I have to write a sort function in RbScript. I need a RbScript editor. Real Studio's own Script editor sucks enormously - it doesn't even have Undo. Fortunately, there is Arbed. It has a better RbScript editor.

While writing my script in Arbed, I notice that its syntax parser doesn't indent Interfaces correctly. I'd like to fix that. I can, because Arbed is another tool of mine.

The RbScript editor and syntax highlighter is coming from the open source class "CustomEditField", written by Alex Restrepo. He has stopped working on it. Coincidentally, I took over.

Thus I am working several hours on the CustomEditField open source project to fix its indentation code, which is fairly convoluted (partly my fault). Eventually I get this done.

Next I need to merge the fixes of the CustomEditField project into the Arbed project file. Naturally, I use Arbed for this.

Merging takes a while because I had recently added new features to CustomEditField directly in Arbed, without merging those improvements back into the open source version. Meaning that I have to merge some code from CEF to Arbed, and other code in the other direction. I have to do this carefully. But eventually, I get all the changes merged into both projects.

When I try to save the updated projects, Arbed gives me an error message: wrong id in block header.

Great. So I have to find this bug. Takes me another 2 hours. It was very very well hidden.

That's where I am as of writing this blog post.

Now, I can go back to merge the changes between the projects once again. Then I can hopefully continue writing the Sort function in RbScript, after which I can finish the iBored template to view the data so that I can write the code I'm getting paid for.

Programming is fun. But so exhausting when you care too much.