TYPO3 4.5 beta1

November 18, 2010, 1:48 pm

As announced on news.typo3.org yesterday, the TYPO3 Core team published TYPO3 4.5 beta1. With this beta the feature set for the next version is frozen, and the remaining time until the release will be used for polishing and bug fixing.
A complete list of the included features and changes can be found on forge.typo3.org.

As a member of the TYPO3 workspace team I’d like to encourage everyone to send as much feedback as possible on all the workspace-related changes. And since maintaining TemplaVoila is also still on my list, it would be great if you let me know about any TemplaVoila-related issues you find with the new TYPO3 version.

If you’d like to stay up to date with the latest changes in the TYPO3 Core without reading all the newsgroup messages, you might be interested in the TYPO3 trunk Changelog RSS feed. Besides the feed there’s another great service from the Core team: they started to publish the minutes of their weekly meeting as “Minutes from the meeting of the release 4.5 team” in the typo3.project.v4 newsgroup. I hope this will be kept up in the future.

↧

Release reviews (templavoila, imagemap_wizard, workspaces)

February 24, 2011, 1:17 pm

I just pushed the TER upload button twice, and in addition TYPO3 4.5.2 will be released tomorrow, containing some nice workspaces updates. So here’s a short summary of what happened in the extension releases.

TemplaVoila 1.5.4
The current release focused on 4.5 compatibility. It uses the new sys_language flag “format” to support sprites, hooks into the new backend form (TCA) “layout” to add its fields to the right tabs within the backend forms, and adjusts everything to work with the new CSRF protection mechanism.

Besides that, bug fixes for the section index, performance improvements and a couple of other changes are included.

One thing you should be aware of in conjunction with 4.5 is that copied elements are hidden by default. In older versions hidden elements didn’t show up in the page module by default, so it might seem that nothing was copied, but that’s not the case. With 1.5.4 the default setting was changed so that hidden elements show up in the page module. Unfortunately the old setting (to skip hidden elements) might still be present in your session settings, so please either clear your session settings or use the “Advanced function” tab in the page module to change the setting and avoid confusion.

Imagemap_wizard 0.6.0
The last versions proved to be very stable, and with some additional sponsoring I was able to improve the DAM and TYPO3 workspaces support. Besides that, a couple of issues which showed up in 4.4 and 4.5 were fixed. One of the next features will hopefully be a useful point-of-interest implementation – keep your fingers crossed that someone’s client wants to sponsor some time for that.

Workspaces 4.5.1 / 4.5.2
Even though it’s shipped with the Core and included in the official release notes, here’s my summary of the improvements. The workspace module itself brought us lots of good feedback, and the new workspace preview also attracted some attention. Even though 4.5.0 was quite stable, we weren’t able to get it working perfectly. The fixes made for 4.5.1 ensured that especially the preview window is much more stable, introduced state persistence (so preview modes and module settings are remembered properly) and brought some performance improvements.
By the way, if you haven’t checked out the new workspaces features and improvements yet, the new workspaces documentation by Susanne Moog is a good place to start.

↧

Signal / Slots in Extbase

November 30, 2011, 9:17 am

A nice thing to have at hand is definitely Signals and Slots. I heard Felix talking about them quite often, and I finally found a nice use case and got to play with them a little bit this afternoon. To spare others the search for how to get them to work, here’s how it works for me.

First of all you should understand the concept. This nice little “definition” (from flow3.typo3.org) sums it up pretty well:

A signal, which contains event information as it makes sense in the case at hand, can be emitted (sent) by any part of the code and is received by one or more slots, which can be any function ~~in FLOW3~~ in extbase.

To get this running in Extbase, you have to get hold of the Tx_Extbase_SignalSlot_Dispatcher, which is the central instance that manages all of it. Within Extbase that’s easily done with this snippet within your classes:

   ...
	/**
	 * @var Tx_Extbase_SignalSlot_Dispatcher
	 */
	protected $signalSlotDispatcher;
   
	/**
	 * @param Tx_Extbase_SignalSlot_Dispatcher $signalSlotDispatcher
	 */
	public function injectSignalSlotDispatcher(Tx_Extbase_SignalSlot_Dispatcher $signalSlotDispatcher) {
		$this->signalSlotDispatcher = $signalSlotDispatcher;
	}
   ...

The next thing is to make use of it. The Slot (listener) part could look like one of the following blocks. In all cases you define the Signal by its class (not necessarily a PHP class) and its name. The Slot can then be defined either by a Closure, by an object with a method name, or by a PHP class name and a method name.

   ...
// Using a closure
$this->signalSlotDispatcher->connect(
     'Crunching', 'emitDataReady', function($data) { crunch($data); }, NULL, FALSE
);
   ...
// Using a method of the current object
$this->signalSlotDispatcher->connect(
     'Crunching', 'emitDataReady', $this, 'crunch', FALSE
);
   ...
// Using a method of the specified class
$this->signalSlotDispatcher->connect(
     'Crunching', 'emitDataReady', 'Cruncher', 'crunch', FALSE
);
   ...

To trigger the Signal, which invokes the Slots registered above, you have to run the following code.

$this->signalSlotDispatcher->dispatch('Crunching', 'emitDataReady', array($data));

One thing I found was that the Tx_Extbase_SignalSlot_Dispatcher is not a Singleton by default in older Extbase versions. Bastian already fixed that in the master and 1-4 branches, and luckily this change was part of the TYPO3 4.6.1 release. But I think it’s still important to mention that this wasn’t the default from the beginning.

Even if AOP is a nicer way to implement this feature, the Extbase backport still works in a pretty straightforward way.

Edit: One thing I have to add – Felix is not “just” talking about Signal/Slots, he’s the one to thank for the backport. And now that his blog is running again, this post seems like a summary.

↧

Tagging page caches

December 22, 2011, 9:23 am

In our small TYPO3 world it’s quite common to have list and single views for extensions on specific pages. That’s quite nice because once a record is changed, it allows users to flush the caches for these pages automatically, using:

TCEMAIN.clearCacheCmd=all|<pid>

But of course it isn’t very smart to clear all caches once you have filled them up for hundreds or even thousands of records. To work around this limitation it’s quite handy to use an API which has been part of TYPO3 since 2008. It lets you add tags to page cache entries and remove cache entries by tag.

Adding tags to the current page can be done with this block:

$GLOBALS['TSFE']->addCacheTags(array('tx_example_model:' . $id));

Removing the caches for that specific page could look like this:

$GLOBALS['typo3CacheManager']->getCache('cache_pages')->flushByTag('tx_example_model:' . $id);

Correctly integrating this with your extension is quite easy. To set the tags, add the first block into your controller and make sure you provide the proper uid for the rendered domain objects. To remove the caches once a record is saved in the backend just register a t3lib_tcemain hook and flush the tagged caches.

Credits: Fabrizio Branca @fbrnc – thx for pointing me to it

The Tcemain-Hook could be integrated with these two snippets:
EXT:example/Classes/Hooks/Tcemain.php

class Tx_Example_Hooks_Tcemain {
	public function processDatamap_afterDatabaseOperations($status, $table, $id, $fieldArray, &$reference) {
		if ($table == 'tx_example_model') {
			$GLOBALS['typo3CacheManager']->getCache('cache_pages')->flushByTag('tx_example_model:' . $id);
		}
	}
}

EXT:example/ext_localconf.php

$GLOBALS['TYPO3_CONF_VARS']['SC_OPTIONS']['t3lib/class.t3lib_tcemain.php']['processDatamapClass']['example'] = 'EXT:example/Classes/Hooks/Tcemain.php:Tx_Example_Hooks_Tcemain';

↧

TemplaVoila 1.6.x

January 15, 2012, 11:10 pm

Quite some things changed in the past months and I never found the time to clear up my mind and write up a summary.

First of all, TemplaVoila 1.6 was released in parallel with TYPO3 4.6. It shipped with 22 bug and compatibility fixes. In general the 1.6.x branch is supposed to be compatible with TYPO3 4.4+, which also made it possible to clean up the extension quite a bit. In addition to that, TemplaVoila 1.6.1 will show up in the TER in the next few hours. It fixes 10 additional issues.

Besides that, the main repository for TemplaVoila was moved to git.typo3.org and the old Subversion repository and my Github repository have been removed. The new repository location also enables and enforces a new way to contribute code changes for the project. Comparable to the TYPO3v4 Core every change request can be sent to Gerrit, where I can review the changes before they get merged into the repository. That workflow turns out to be very efficient for me. A detailed summary on how to contribute code to any repository hosted on git.typo3.org can be found on wiki.typo3.org. To sum up some of the steps – here’s how you submit a patch*:

git clone git://git.typo3.org/TYPO3v4/Extensions/templavoila.git
cd templavoila
scp -p -P 29418 <username>@review.typo3.org:hooks/commit-msg .git/hooks/
git checkout -b workingBranch
# ... work on the files ...
git add <changedFile>
git commit
git push origin HEAD:refs/for/master/<topic>

A further general change to the team happened more or less silently. During 2011 nobody from the old team and no new developer showed interest in the project, and only a few contributors helped with bug fixes or reviews. Because of that I also changed my attitude regarding future releases.
I’m currently planning to release one version in parallel with every main TYPO3 4.x release, to make sure that TemplaVoila works in new versions and that people can enjoy TYPO3 with TemplaVoila in the future as well. I’ll also try to keep it compatible with all “stable” releases and aim to keep the issue count within the bugtracker as low as possible.
But I’m not planning to integrate new major features such as field content sliding (use EXT:kb_tv_cont_slide please) or a context-sensitive content wizard (a.k.a. “content firewall”). Major refactorings won’t happen either because they’d consume far too much of my time. Therefore the desperately needed update of the mapping module and a new page module will not be implemented in the near future.

Nevertheless I’m open to any type of contribution and I’d be happy to review and test any patch showing up in Mantis or Gerrit – I’m just not able to spend a major part of my free time on it.

* You have to sign the Contributor License Agreement to be able to push any change – it can be found on typo3.org

↧

Visualizing TYPO3 Core activity

January 20, 2012, 12:00 am

DISCLAIMER: Don’t take the following too seriously. When reading this post please keep in mind that the TYPO3 community itself consists of much more than just the Core team activity – many things take place outside of code repositories and can’t be measured at all. Every contribution is important and every single action has value.

Using Gerrit sometimes feels quite lonesome – you don’t really see who’s active and you don’t really get a feeling for how much is being done in the TYPO3 Core at a certain point. To visualize how active the contributors are and to show the most active members, I applied a little scoring and summed up the results for the time since 2006.

Early 2006 impact chart with a few early contributors - click the image to see the full chart

The scoring is quite simple – every author of a patch gets 10 points, testers get 3 and reviewers 1 point*. Looking at the stats, even the statistics pulled from the old Subversion days seem to match today’s numbers. Of course we have to keep in mind that everything pulled from Subversion doesn’t really point to the author but to the actual committer (unless there was a “Thanks to XXXXX” reference in the commit).
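The scoring rule above can be sketched in a few lines of Python. This is only a minimal sketch and not the published script: the commit record layout (`date`, `authors`, `testers`, `reviewers`) and the sample names are assumptions made for illustration. Only the point values and the grouping by month come from the post.

```python
from collections import defaultdict

# Points per role as described in the post: author 10, tester 3, reviewer 1.
POINTS = {"author": 10, "tester": 3, "reviewer": 1}


def score_commits(commits):
    """Sum up the points per (month, contributor).

    Each commit is a dict with a 'date' (YYYY-MM-DD) and optional
    'authors', 'testers' and 'reviewers' lists (hypothetical layout).
    """
    scores = defaultdict(int)
    for commit in commits:
        month = commit["date"][:7]  # group by month, e.g. "2012-01"
        for role, points in POINTS.items():
            for person in commit.get(role + "s", []):
                scores[(month, person)] += points
    return dict(scores)


# Made-up sample data, only to show the calculation.
sample = [
    {"date": "2012-01-10", "authors": ["alice"],
     "testers": ["bob"], "reviewers": ["carol", "bob"]},
    {"date": "2012-01-12", "authors": ["bob"],
     "testers": [], "reviewers": ["alice"]},
]
result = score_commits(sample)
```

With this sample, alice ends up with 11 points for January 2012 (one patch, one review), bob with 14 and carol with 1.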

To visualize the numbers I chose a Github-like impact chart**. Each contributor has their own color and line in it, and whenever they were active the width of the line scales up. To keep the chart readable, the scale of the width isn’t linear, and every contributor who’s not within the “Top 20″ had to be scaled down to “1″. The line stops when the contributor is no longer active. The scores are grouped and compared by month.

Snapshot of the "Top 20" stats taken Jan. 14th 2012

Covering the last 3 months - see the entire stats on the referenced page.

In addition to the chart I also created a table based overview for the “Top 20″ contributors with their score***.

A few things I took from the numbers: there have been 290 contributors already – simply amazing. Comparing the (huge) scores of the release managers with their fellow contributors shows once more that they can be really proud of the job they did or do. It’s also quite cool to see that many people keep sticking around. And finally it was quite surprising to see that our community manager Ben van’t Ende can also be found in the stats. Besides that it’s up to everyone else to draw their own conclusions from these numbers. I hope it’s motivating for everyone to see that the community is as active as always.

The script I created to generate the charts isn’t too nice at the moment, ~~but I promise to publish it once I cleaned it up a bit. If you can’t wait to get your hands on it, feel free to send me a mail or tweet.~~ – but I published it anyway: github.com/tolleiv/Repo-Activity-Monitor

Links in short: Impact chart, Monthly “Top 20″ tables.

* The scoring includes the commits to the Core master and all related submodule commits. Unfortunately the submodule commits don’t hold the reviewer and tester information. Also some of the latest changes made in the submodules may not show up at the moment because the submodule pointers haven’t been updated yet.
** Inspired by the incredible RaphaëlJS vector graphics library which is distributed with an MIT license and can be found on raphaeljs.com
*** Again – don’t take it too seriously. The scoring doesn’t take into account that some patches can be written in 15 minutes whereas others take days. It also only uses the final testers and reviewers mentioned in the commit message; other contributions are not counted at the moment.

↧

Finding TypoScript errors.

February 16, 2012, 8:02 am

When you work on TypoScript templates in TYPO3, errors might show up in the TypoScript Object Browser. Within the error messages you’ll see a more or less detailed error description with the related line number. Within most setups these line numbers won’t relate to any of your sys_template records or TypoScript files directly. But they still provide value if you know how they help to find the right spot. As it’s not too obvious how to find the right spot I’ve created a little screenshot series to guide you to the broken spots in your templates.

So that’s what you might see in your TypoScript Object Browser:

Switching from there to the Template Analyzer:

At the bottom of the Template Analyzer, you’ll find a “Complete TS” section and a link which looks like normal text.

Clicking on that link will give you the entire concatenated TypoScript, and here you’ll find that the line numbers finally match the error message.

A well hidden gem which works most likely in all TYPO3 4.x versions

Edit: In the meantime, Ingo’s patch made it through the review process. So users of TYPO3 4.7 and above will find a nice and handy “Show details” link next to the error message, which makes it much faster to find the broken spot. Thanks Ingo (@irnnr)

↧

TYPO3 login state and Varnish cache

December 10, 2012, 7:56 am

Caching is hard in complex page setups with user-specific content, especially when public pages change their content once a user is logged in. TYPO3 is smart enough to deal with the login state properly and cache appropriately. Once Varnish is involved, it’s quite tricky to cache as much as possible without losing the dynamic content. But it’s not impossible, and here’s my summary of how we resolved it for typo3.org.

Setup

The basic Varnish setup is more or less always the same and best described by Fabrizio Branca. On top of that we need some TypoScript parameter tweaking to get the cache-control headers in TYPO3 straight – Daniel Pötzinger’s article covers them best. Another very handy thing which can be found in Fabrizio’s blog is the simplified flow chart for the various Varnish subroutines. Based on that, all pages should be cached properly and your site should run smoothly. But in case you have a page with personalized content, you’ll have to reconsider some parts.

Problem

The event submission page on typo3.org is a good example. In case the user is not logged in (a.k.a public page), we just want to show a message which guides him to the login. If there’s a login active (a.k.a user page), we’ll show the submission form instead. In both cases we could cache the content nicely, but how would we ensure that Varnish delivers the correct content?

Solution

Ajax could be a solution, but for large sites it’s usually smarter to avoid as much JavaScript as possible. EdgeSideIncludes (ESI) are another option, but I agree with Daniel, they’re not really useful in this case and I’d rather go with Ajax than with ESI.

What we want in this scenario is to cache the public page in Varnish and to pass the request on to TYPO3, which generates the user page, if we find that the user is logged in. But this should of course only happen on pages where it is really necessary – normal pages should just ignore the login state of the user. Therefore we need something to distinguish normal pages from login-specific pages in Varnish. Luckily TYPO3 already provides a field in the page properties which allows this distinction. Using the Login Behaviour (pages.fe_login_mode) field, you can enable and disable the user login for specific branches and pages*. As we want to whitelist login-specific pages, our root page should have the setting “Disable Login”, which is inherited by all sub-pages. All the login-specific pages should have the setting “Re-Enable login”.

Once this is done, we need a way to carry that out to Varnish. We improved EXT:cacheinfo for that purpose, with that it now carries a “loginAllowedInBranch” or “noLoginAllowedInBranch” value in the “X-T3CacheInfo” header. Using all that, the Varnish VCL can be extended to make use of it like this:

sub vcl_hit {
  if (obj.http.X-T3CacheInfo ~ "loginAllowedInBranch") {
    set obj.http.Cache-Control = "private";
    if (req.http.Cookie ~ "(e_typo_user|PHPSESSID|_pk_.*)") {
      # Do not cache requests which come from a logged in user
      return (pass);
    }
  }
}

This is straightforward. For every page which allows logins, we make sure that the client does not keep it in its cache. In case we’re actually on such a page and find the related login cookies, we pass the request along to TYPO3; otherwise we deliver the public page right away from the cache. The fact that we pass the request along to TYPO3 in some cases doesn’t mean that we’ll deliver the user page, it just indicates that we have to rely on TYPO3 to make the right choice based on the actual login state.

Conclusion

For me the beauty here lies in the simplicity. Once you have wrapped your head around the flow chart and managed to deliver appropriate meta-data to Varnish, many more complex scenarios can be resolved in the same way.

As most of the typo3.org stuff, this solution came from a great team. In this case Michael Stucki and Daniel Pötzinger helped to craft the final solution – thanks guys

* the naming of the field’s labels is really irritating – especially “0 – Enable login” should be “Inherit setting” as it really does not force any setting.

↧

Re: How to move on? #TYPO3

December 22, 2012, 1:11 am

As I wanted to answer Robert’s post but didn’t like the privacy of our Core-internal list, here’s some kind of response to it. Alongside I’ll try to explain the current situation and problems a bit***.

What happened?

In his post Robert more or less gave up his attempts to establish a TYPO3 product board [1]. The reasons for this are manifold, but mainly the various flame wars of the last year and the very personal attacks brought him to the conclusion to:

* no longer invest my time into setting up or participating in an overall product team
* refrain from trying to establish leadership for the TYPO3 project
* concentrate on Flow and Neos and invite teams to participate in frequent meetings about it
* unsubscribe from the core internal mailing list

Even though this shortens his entire mail a bit, that’s the bottom line it comes down to****.

Looking at the recording of the “Not-the-product-board” [7] meeting from last Tuesday [2], it seems that these steps aren’t necessary, as everybody seemed to be happy with the setup. So what’s the criticism actually about? Luckily I don’t have to describe it myself, but can cite an (again internal) response to an earlier mail from one of my fellow Core Team members:

Suddenly, when you (Robert) left the steering committee of the T3A because of the
new bodies and bylaws, you realized that there is no more power for you
to decide about things. That’s why you brought up the concept of a
product team. It’s an attempt to replace the former steering committee,
which has backed up the decisions of the core team in a nice but
undemocratic way.

In addition to these two standpoints we’ve a mixture of opinions and directions in our community and all of them cause quite some irritation [3,4,5]. With all these repeating fights and discussions Robert and (I guess) most of us ask ourselves how the community could avoid the fighting?

So what’s my opinion:

What we should avoid

Closed-door meetings and private discussions – most of the “emotional outbursts” we saw were caused when seemingly final results were presented to the community out of the blue. The fact that Core-internal is still a vivid place is a problem and we all should abandon it.
Working without vision – the long-standing support (since 2006) for Neos and the nice drive gridelements got from its fans [9] are two good examples that a vision can move mountains. But we should renew the vision statement and provide roadmaps on how we want to achieve it – mainly to motivate contributors and to enable collaboration between the teams.
Leaders or leader groups – other open source products have their benevolent dictator – this wouldn’t work for our community. Their focus will never reach the entire community and in the end their decisions will always cause confusion. Imho Kasper did a great job bringing TYPO3 to life – but he/we saw that the “dictatorship” didn’t scale in the end [12], and imho Robert’s response and the controversy around the product board kind of show that too.**
Personal attacks – nobody joined the TYPO3 community to fight, so there’s no reason to fight back. Within all the controversy and disappointment everyone tries to improve TYPO3 in their own way [10].

What we need

Diverse groups from all parts of the community – there’s no other way to capture all ideas and to gather various people from the entire community. As a nice side effect they make the “surface” of our community much larger and might help to involve new users. Btw. our community manager Ben van’t Ende will be happy to help kickstarting and coordinating groups.
Open meeting protocols and discussions – to make sure there’s a chance for everyone to catch up later and to avoid bad surprises as we saw during the rebranding [3] or during the version schema change [6]. In both cases small groups of people made major (not necessarily bad) decisions on their own behind closed doors – the following controversies showed that good intentions can turn out very negative.
Clear communication structures in the groups – to make sure things are “presented” appropriately. As seen in [6] or in more recent situations [8], things would improve if groups had clear communication channels. In [6] and [8], seemingly official messages turned out to be personal attempts without actual “approval” / “consensus” from the related groups.
Rough product roadmaps – they should exist to make sure people share a common vision, but they should not be too strict, so that nobody feels too bound to them. They should also exist to enable some kind of measurement. We should be able to ask the Neos or CMS team whether they “live” up to their vision or whether it would be better to adjust it.
Constructive honesty – people should discuss openly, in a respectful way [11]. Technical doubts and constructive criticism should be allowed, but it should be fair.

What does that mean for our products?

Honestly: I don’t know. Looking at the current situation with the lists from above: we have tons of groups working in many directions, and we have a group that agreed on a common product vision (two times) [13], but the group communicates behind closed doors. I’m not aware of any Neos or Flow roadmaps, but TYPO3 CMS has at least a (short) roadmap [14]. The Core team doesn’t have clear communication structures yet and, as shown, this causes tons of confusion. In addition our discussions tend to get very personal. It seems that something like a “product board” could help but…

Product board and leadership

On one hand I don’t like the way how the “product board” was positioned in the beginning – it should not lead anything or decide anything. On the other hand it’s great to have a group of people taking care to formulate a vision. The access to this group should be open to everyone, not just group leaders. Inner-circles, Top-10 groups, leader groups, Core-internal discussions should be avoided and open group-”setups” should be emphasized.

Hope this made sense?

Read on:*

[1] – Google Doc: Product Board (or “Product Team”)
[2] – TYPO3 Product Hangout (On Air)
[3] – Die Marke TYPO3 erfindet sich neu
[4] – Rebranding: Get the green back
[5] – Wieviel Kommunikation und Roadmaps braucht ein Open Source Projekt?
[6] – TYPO3 6.0 at the corner? How is it possible?
[7] – @kdambekalns: Now a first #TYPO3 “product …” meeting, not official, no decisions, nothing.
[8] – @WrYBiT: @benvantende .. I expected, a mail by you and a news on T3O about it…
[9] – Startnext: Verbessertes TYPO3-Backend mit neuen Features
[10] – @tom_noice: . @thomas_hempel The discussions are there because so many people care.
[11] – Community Code of Conduct
[12] – King for a day, but not for a lifetime
[13] – The Phoenix team reports on the Developer Days 2012
[14] – Proposal for the upcoming Roadmap and LTS

* take your time to read the endless lines of comments.
** Imho Oliver Hader chose a good approach to (not) “lead” the Core Team – more in the sense of “managing” and “enabling” without ruling
*** Some tweets and messages from the Core members might have been quite confusing without the context
**** I’d prefer not to play TYPO3-leaks here, so it’s up to Robert and the others to publish their mails by themselves

↧

Using genetic algorithms to optimize Apache Solr boost factors.

June 6, 2013, 11:47 am

Configuration interface.

One thing I took along from last year’s ApacheCon was the idea to combine Apache Solr along with some mathematical search algorithms to figure out boost factor values. I did some work on that back then and on the way to this year’s BerlinBuzzwords. Now I finally have a proof-of-concept working which I’d like to share. If you want to have a look right away – the code can be found on Github.

The problem to solve:

When running search indexes with Solr, one thing you might stumble upon is that you have various fields in your documents and have to adjust their weights to get reasonable results. Finding those “boosting” values can be quite complex when you have many fields and many scenarios. Usually getting the values right is a task for very experienced integrators.

/solr/select?defType=dismax&q=my+query
&qf=title^42+description^23+footnotes^5+dalmatiners^101+foo^9001+comments

Looking at it from a more technical perspective: when your Solr query looks like the one above, the question you have to answer is which values the boost numbers (the numbers after each ^) should have to produce reasonable results.
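To make the link between a list of boost values and the query explicit, here is a small Python sketch that builds the `qf` parameter from field/boost pairs. The helper names `build_qf` and `build_select_url` are made up for this illustration, and the `comments` field gets an explicit boost of 1 here, which is Solr's default for a field listed without a boost.

```python
from urllib.parse import urlencode


def build_qf(boosts):
    """Turn (field, boost) pairs into a dismax 'qf' value."""
    return " ".join(f"{field}^{boost}" for field, boost in boosts)


def build_select_url(query, boosts):
    """Build the request path for a dismax query (hypothetical helper)."""
    params = {"defType": "dismax", "q": query, "qf": build_qf(boosts)}
    return "/solr/select?" + urlencode(params)


# The fields and values from the query shown above.
boosts = [("title", 42), ("description", 23), ("footnotes", 5),
          ("dalmatiners", 101), ("foo", 9001), ("comments", 1)]
qf = build_qf(boosts)
url = build_select_url("my query", boosts)
```

`urlencode` escapes the `^` as `%5E` and the spaces as `+`, so the resulting URL is an encoded form of the query shown above.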

Measuring “reasonable”:

In order to solve the problem, the first thing we have to do is to define what we expect the outcome to look like. In other words, we have to measure how reasonable a specific solution is. For a search engine this can be done with some sample queries and expectations for each of them. An expectation could state explicitly which documents we expect in the result list of a specific query (and at which positions). Once we have these expectations, we can simply test against them and check whether or not specific boost factor values actually satisfy them.

A small example: assume we have a sample query with the expectation that document 123 appears in first position and document 248 in second. We could run this query with two specific boost factor combinations (a) and (b). With (a) we might find that document 123 actually ranks at position 8 and document 248 at position 4, and with (b) we’d find them at positions 2 and 14. Which one would we consider to be better?
Comparing the “error” and “squared error” produced by (a) and (b) gives us a possible hint:
(a) error: (8-1) + (4-2) = 7+2 = 9
(a) squared error: (8-1)² + (4-2)² = 49+4 = 53
(b) error: (2-1) + (14-2) = 1+12 = 13
(b) squared error: (2-1)² + (14-2)² = 1+144 = 145
While the difference is not very pronounced when comparing just the plain error values, the squared error shows clearly that (a) outperforms (b) in this case, because it penalizes large deviations much harder, and we should choose (a) for further considerations.
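The comparison above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `ranking_errors` and the dictionary layout (document id mapped to position) are assumptions for illustration, and documents that are missing from a result list are not handled.

```python
def ranking_errors(expected, actual):
    """Return (error, squared error) for one sample query.

    expected and actual map a document id to its expected / actual
    position in the result list (hypothetical layout).
    """
    diffs = [abs(actual[doc] - position) for doc, position in expected.items()]
    return sum(diffs), sum(d * d for d in diffs)


# Expectation from the example: document 123 first, document 248 second.
expected = {123: 1, 248: 2}

errors_a = ranking_errors(expected, {123: 8, 248: 4})   # boost combination (a)
errors_b = ranking_errors(expected, {123: 2, 248: 14})  # boost combination (b)
```

Here `errors_a` is `(9, 53)` and `errors_b` is `(13, 145)`, matching the numbers in the example. Summing these values over all sample queries gives one cost value per boost factor combination.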

Being able to determine the “error” introduced by a specific solution then enables us to compare various solutions and helps us to play around with all sorts of optimizations.

The idea:

With a defined “cost function” like the one I introduced before, you’d be able to tackle the problem with some well known algorithmic solutions. Considering the boost factors to be represented as numerical vectors, we could use gradient methodologies to find good solutions. But having 20-40 fields per document would require to “search” a large numerical space and with gradient methods, this would result in a large amount of queries.

Another approach to run these optimizations, is to utilize genetic algorithms which kind of help to find good solutions within predictable amounts of time. You might know genetic algorithms for some lectures where people solved traveling salesman problems and actually the only change you’d have to make is to exchange the traveling salesman cost function with the cost function you saw before and you’d be close to a solution already.

In some more detail: genetic algorithms take a number of randomly generated possible solutions (called the population) and try to find good solutions by applying the typical methods you know from your biology class (mutations, crossovers, natural selection). Natural selection is done in a way that from each generation only the top 50% “survive” and the rest of the population is filled up with new solutions generated through mutations (random parameter changes of existing solutions) and crossovers (interchanging parts of two existing solutions to create a third one). All solutions are always measured and compared by their result for the defined cost function, and this way we’re always able to determine the “best known solution”, even after a very short time.

If that sounds too high-level, here is an example. For the query shown above, the vector [42,23,5,101,9001,1] is the vector I used. In addition, let’s consider another vector [1,1,1,1,1,1] with equal weights for all fields. Assuming those are our fittest vectors at a given time, we could derive new possible solutions by mutating them (e.g. [42,23,5,101,9001,1] ~> [42,23,5,101,505,1] ) or creating a cross-over between both ( [42,23,5,101,9001,1] & [1,1,1,1,1,1] ~> [42,23,5,1,1,1]). Even adding new random vectors to our population might add some value. Once we have found enough new vectors to have a population of a decent size, we’d compare the fitness, keep only the top 50% and continue our process until we reach convergence or a fixed iteration limit.
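
One generation of that process could look roughly like the following JavaScript sketch. It is not the code from the repository: all function names, the 50/50 choice between mutation and crossover and the maxBoost range are assumptions for illustration, and the cost function is assumed to be synchronous here, whereas the real tool has to query Solr asynchronously.

```javascript
// Random integer in [0, max); `random` is injectable to make runs repeatable.
function randomInt(max, random = Math.random) {
  return Math.floor(random() * max);
}

// Mutation: copy the vector and replace one randomly chosen boost factor.
// Assumption: boost factors are integers between 1 and maxBoost.
function mutate(vector, random = Math.random, maxBoost = 10000) {
  const child = vector.slice();
  child[randomInt(child.length, random)] = 1 + randomInt(maxBoost, random);
  return child;
}

// Single-point crossover: the first `cut` factors of a, the rest from b.
function crossover(a, b, cut) {
  return a.slice(0, cut).concat(b.slice(cut));
}

// One generation: keep the fittest half (lowest cost) and refill the
// population with mutations and crossovers of the survivors.
function nextGeneration(population, cost, random = Math.random) {
  const ranked = population
    .map((vector) => ({ vector, cost: cost(vector) }))
    .sort((x, y) => x.cost - y.cost);
  const survivors = ranked
    .slice(0, Math.ceil(ranked.length / 2))
    .map((entry) => entry.vector);
  const next = survivors.slice();
  while (next.length < population.length) {
    const a = survivors[randomInt(survivors.length, random)];
    const b = survivors[randomInt(survivors.length, random)];
    const cut = 1 + randomInt(a.length - 1, random);
    next.push(random() < 0.5 ? mutate(a, random) : crossover(a, b, cut));
  }
  return next;
}

const fittest = [42, 23, 5, 101, 9001, 1];
const uniform = [1, 1, 1, 1, 1, 1];
console.log(crossover(fittest, uniform, 3)); // [ 42, 23, 5, 1, 1, 1 ]
```

Because the survivors are carried over unchanged, the best known solution can never get worse from one generation to the next.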

A drawback of the genetic algorithm is that it might not deliver the optimal result, simply because it never came across it. But that’s just how nature works too. So you have to trade “training runtime” against accuracy.

Implementation:

10 generation optimization

There’s really not too much to say other than that the code can be found on GitHub. I used NodeJS with ExpressJs, SocketIO and a Twitter Bootstrap interface to have a relatively good-looking and reasonably performant proof-of-concept. I used that setup because NodeJS seems to me to be the easiest way to talk to Solr and it “promises” to be performant even with larger examples. SocketIO helped a lot to ease the pain when it comes to server <> client communication. The only drawback of that setup is that everything had to be turned into something which is able to deal with asynchronous processing. This makes the algorithmic parts look a bit odd and bloated – but for me the benefits outweigh the drawbacks.

Final thoughts:

The proof-of-concept, which you’ll find in the GitHub repository, demonstrates that this type of optimization can work, and that’s more or less all I wanted to show with it.

You can use the NodeJS tool with any of your Solr indexes and just go ahead and try it yourself. Many parts aren’t too accurate yet. In particular, the measuring could probably be done better with precision and recall (P/R) measurements – but I assume that any type of cost function would work for now, which is why P/R wasn’t implemented in the tool. Also, I’m not a NodeJS expert and the code might not follow best practices at the moment – I’d be very happy to change that if anyone is interested in helping.

When I did a small presentation during the Berlin Buzzwords bar camp I also got some other questions which don’t necessarily relate to just this implementation but to all sorts of automated optimizations.

The first question was how to get the list of example queries and the “expected” documents for them. For now I assume that most applications at least know their top 50 or top 100 searches and should be able to predefine “relevant” documents for those searches. That’s at least what I assume everyone should have. Another way to generate the test data is to analyze the log files, check the searches and pick/weight the documents people clicked within the results. This should also help to get some results.
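
As a rough idea of how the log file approach could work, here’s a minimal JavaScript sketch. The log format (one { query, docId } entry per click), the function name expectationsFromClicks and the rule “most clicked document is expected first” are purely assumptions for illustration; the tool itself doesn’t contain anything like this.

```javascript
// Hypothetical: derive expected positions from a click log.
// Assumed log format: one entry per click, e.g. { query: 'typo3', docId: 123 }.
function expectationsFromClicks(clickLog, topN = 3) {
  const clicksPerQuery = new Map(); // query -> Map(docId -> number of clicks)
  for (const { query, docId } of clickLog) {
    // Treat queries that only differ in case or surrounding spaces as equal.
    const key = query.trim().toLowerCase();
    if (!clicksPerQuery.has(key)) clicksPerQuery.set(key, new Map());
    const perDoc = clicksPerQuery.get(key);
    perDoc.set(docId, (perDoc.get(docId) || 0) + 1);
  }

  const expectations = {};
  for (const [query, perDoc] of clicksPerQuery) {
    // Assumption: the most clicked document is expected at position 1, etc.
    const ranked = [...perDoc.entries()]
      .sort((a, b) => b[1] - a[1])
      .slice(0, topN);
    expectations[query] = {};
    ranked.forEach(([docId], index) => {
      expectations[query][docId] = index + 1;
    });
  }
  return expectations;
}
```

The result maps each query to “document id => expected position”, which is the kind of expectation a cost function needs as input.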

Another related question was whether the long tail would fall behind with that approach. As this is only a proof-of-concept, I wasn’t really able to answer this. But I assume that long-tail searches would still benefit a lot more from the relevance certain documents gain due to high TF-IDF scores, and those should then outweigh the “scoring bias” to some degree. Another approach (known from machine learning) could be to leave out the top 1% of the documents (and searches), optimize just for the rest of the top X% and afterwards check whether the top 1% still performs well – this way the long tail could be “protected” a bit more.

And the last question was whether I had already tested other (gradient-based) algorithms. The answer was and is no. So far this only ran on my MacBook and I really didn’t want to benchmark my CPU. The code itself is somewhat prepared to take other optimization methods, but I didn’t add any others. If you’re interested in doing so – I’d be happy to accept your pull requests.

↧

TemplaVoila future

June 8, 2013, 12:20 am

≫ Next: TemplaVoila – followup

≪ Previous: Using genetic algorithms to optimize Apache Solr boost factors.

If you followed some of my comments in the TYPO3 newsgroups recently, you’ve heard that I’m not very satisfied with the TYPO3 project in general, and that’s also reflected in my activity for TemplaVoila. After a certain time of inactivity I even had to ask myself whether it’s worth keeping it in the TYPO3 universe or not. Since this isn’t an easy decision, I created a pro and con list which I’d like to share before I draw conclusions.

Why TemplaVoila maintenance should continue:

it made TYPO3 attractive for many less technical people (people who don’t even understand conditions or loops)
it contains and combines concepts (language, workspaces, content structuring) which aren’t represented in any way in other solutions
it is still used within the community and various indicators prove that it is still very popular

Why TemplaVoila should not be maintained anymore:

it is not supported by the active contributors at all
it is constantly under some kind of PR-attack from the other solutions (which is very demotivating)
it lacks a developer “community” or at least a team
it has horribly outdated documentation which has to be reworked
code refactoring is not really possible, the code is horrible to maintain
its concepts can’t be ported in any way to FLOW/extbase (extbase itself is broken when it comes to workspaces or languages – no way to port over alternative concepts for these)
UI wise, Prototype and ExtJS have been used for it and need to be replaced with whatever the TYPO3 Core could offer
some of its concepts need to be reworked (language) to be much more usable
the TYPO3 Core changes in a way that extension maintenance is no fun at all

Conclusions:

I could add further points to both lists, but in general you’ll get my point. All these have been on my mind for quite some time, I discussed them with various members of the TYPO3 community, and I came to the conclusion that TemplaVoila should at least disappear from the TER so that no new users start using it.

Along with that, I also came to the conclusion that handing TemplaVoila back to Dmitry, Robert or Kasper wouldn’t make sense either – the remaining workload is enormous and a single developer alone would never be able to deliver anything of reasonable quality (incl. documentation).

This basically means that TemplaVoila won’t be actively distributed and supported by me and that there won’t be any new public releases. In order to keep up the ability to fix bugs, I’d offer to keep Forge+Git+Gerrit open and I’m still willing to review and merge patches (through Gerrit). Even though I didn’t see too much activity from others for TemplaVoila within Gerrit, I assume that this should be enough to support running projects.

To keep the discussion this announcement might need from ending up in my blog, I’ll close the comments here and invite you to comment within the related newsgroup entry in typo3.project.templavoila.

↧

TemplaVoila – followup

June 14, 2013, 2:45 am

≫ Next: Peter Sunde talking about flattr

≪ Previous: TemplaVoila future

It seems that the news regarding TemplaVoila was published at the right time and found its way through the community. But along with it, some irritation came up which needs some additional words.

Will there be a TYPO3 CMS 6.2 compatible version of TemplaVoila?

As it seems: yes – some members of the community offered their help. I’ll reach out to them in the next days and try to connect them as well as I can. I will of course support them as well as possible. I can’t really tell whether there will be a TER release for that or whether we’ll keep it in the Git repository only. That’s up for discussion – I made my point about that already and still find it valid. In case you’re interested in helping (coding, documentation, anything) – just send me a note.

Was it a black day for the community?

I doubt that – a single tweet mentioning people clapping at the news made big waves in the newsgroups, nothing else. Most likely this was just a misunderstanding, and most importantly it didn’t reflect the generally very positive feedback I got through all other channels.

What about #T3CS13?

That’s first of all a great community event and everyone should try to get #T3CS14 tickets next year. Nevertheless, it seems some people (still) clapped their hands when they heard the TemplaVoila news, which seemed very inappropriate. Because of that, Jochen Weiland (one of the organizers) sent me a personal apology and posted an “official tweet” about what happened at #T3CS13. From my perspective this wasn’t even needed, but it’s nice that he did it anyway. The fact that Peter Pröll (proof) made a very clear statement during the event about some people’s misbehavior was already the right move. So let’s just close this chapter.

What are the alternatives, now?

We’ll see, but I hope that 6.2 or 6.3 will come along with a very strong suggestion in that regard and maybe a “Core candidate” extension/solution, just to avoid a double- or triple-solution situation growing once again.

Finally thanks for all the warm words and wishes. This made me very happy. Also please keep in mind that I just maintained TemplaVoila during the last 3-4 years and that I took over from Dmitry, Robert and Kasper.

↧

Peter Sunde talking about flattr

August 2, 2010, 8:11 am

≫ Next: TemplaVoila on github.com

≪ Previous: TemplaVoila – followup

Markus Beckedahl’s interview with Peter Sunde brought up some interesting insights into the system of flattr.com and the idea behind it. So if you’re interested in flattr.com you should listen to the podcast on netzpolitik.org.

My favorite quote: “In the end stupid stuff funds good stuff” P. Sunde

Another (shorter) interview taken during the Hacknight in Malmö can be found on nrli.tv

↧

TemplaVoila on github.com

September 3, 2010, 8:52 am

≫ Next: TemplaVoila 1.5 released

≪ Previous: Peter Sunde talking about flattr

Update: Finally TemplaVoila moved to git.typo3.org – please read the update here

As it seems, source code version control for TYPO3 will be done with Git in the future. The Phoenix team already uses git.typo3.org, it’s already possible to get the latest updates for the v4 Core via github.com, and it won’t take long until git.typo3.org is used for version 4 as well.

Due to the fact that my TemplaVoila development workflow is also already git based, I thought it might be interesting for some contributors to develop with a git repository upstream. Therefore I started to maintain a TemplaVoila repository on github.com [1]. The Subversion repository on forge.typo3.org [2] is of course still the master, but both repositories are kept in sync automatically.

So if you’re thinking about sending an RFC to the typo3.team.templavoila list, feel free to attach a git-based patch file.

[1] http://github.com/tolleiv/TemplaVoila
[2] https://svn.typo3.org/TYPO3v4/Extensions/templavoila/

↧

TemplaVoila 1.5 released

October 3, 2010, 1:31 pm

≫ Next: TYPO3 4.5 beta1

≪ Previous: TemplaVoila on github.com

The new version comes with many bugfixes, new features and a closer TYPO3 integration. Overall 95 issues have been resolved in the last 4 months to finalize this version. Some of the highlights are:

HTML5 support

The full list of HTML5 tags is now supported in TemplaVoila. The restriction to specific tags was removed and the TYPO3 integrator is now able to use the full range of modern HTML. With this change the tag icons themselves were also replaced and the color scheme was changed. The inspiration for the current color scheme came from Josh Duck’s “Periodic Table of the Elements”. In addition, some mapping bugs have been resolved too – for details see #13974 and #14881.

TYPO3 4.4 Look&Feel and docHeader integration

TYPO3 4.4 introduced a new skin and changed the look&feel in the backend radically. Once installed in 4.4, TemplaVoila 1.5 adjusts its look and provides the same usability improvements as the official page module. The page module was optimized to use as much “official” CSS as possible to support designers with their own backend skins. In addition to the CSS and markup changes, TemplaVoila also uses the TYPO3 4.4 SpriteIcon API to provide and retrieve backend icons and uses the FlashMessage API to style all backend notifications.

Another important step was the integration of the so called docHeader. This is the area at the very top of each backend module page which provides useful tools and action-icons. With this version TemplaVoila finally provides docHeaders within every backend-part.

Improved TYPO3 integration

Besides the visual changes the general TYPO3 integration has been improved with various modifications.

With the current version there’s no need to give “Edit page” rights to your editors if they want to add or remove content elements. The “Edit content” right and access to the “Page>Content” field are enough for them. For details see: #3903

The “advanced header link inclusion” is one of the integration steps in the frontend. All resources which are related to an FCE are passed through a TYPO3 API (pageRenderer). This avoids duplicate inclusion of a resource (e.g. CSS files) and enables further post-processing (e.g. compression or merging). It can be enabled using the “advancedHeaderInclusion” option within your TypoScript setup, which could then look like this:

page = PAGE
page.typeNum = 0
page.10 = USER
page.10.userFunc = tx_templavoila_pi1->main_page
page.10.advancedHeaderInclusion = 1

The full list of changes within this version can be found on bugs.typo3.org. Many many thanks to all contributors and reviewers – it’s great that more people try to help out and it keeps me motivated to continue improving this great TYPO3 extension.

↧

Release reviews (templavoila, imagemap_wizard, workspaces)

February 24, 2011, 1:17 pm

≫ Next: Signal / Slots in Extbase

≪ Previous: TYPO3 4.5 beta1

TemplaVoila 1.5.4
The current release focussed on 4.5 compatibility. It uses the new sys_language flag “format” to support sprites, it hooks into the new backend-form (TCA) “layout” and adds its fields to the right tabs within the backend forms, and it adjusts everything to work fine with the new CSRF mechanism.

Besides that bug fixes for the section index, performance improvements and a couple more are included.

Imagemap_wizard 0.6.0
The last versions proved to be very stable and with some additional sponsoring I was able to improve the DAM and TYPO3 workspaces support. Besides that, a couple of issues which showed up in 4.4 and 4.5 were fixed. One of the next features will hopefully be a useful point of interest implementation – keep your fingers crossed that someone’s client wants to sponsor some time for that ;)

Workspaces 4.5.1 / 4.5.2
Even if it’s shipped with the Core and included in the official release notes, here’s my summary of the improvements. The workspace module itself brought us lots of good feedback and the new workspace preview also attracted some attention. Even though 4.5.0 was quite stable, we weren’t able to get it working perfectly. The fixes made for 4.5.1 made the preview window in particular much more stable, introduced state persistence (so preview modes and module settings are remembered properly) and brought some performance improvements. Btw. if you didn’t check out the new workspaces features and improvements yet, the new workspaces documentation from Susanne Moog is a good place to start.

↧

Signal / Slots in Extbase

November 30, 2011, 9:17 am

≫ Next: Tagging page caches

≪ Previous: Release reviews (templavoila, imagemap_wizard, workspaces)

First of all you should understand the concept. This nice little “definition” (from flow3.typo3.org) sums it up pretty well:

A signal, which contains event information as it makes sense in the case at hand, can be emitted (sent) by any part of the code and is received by one or more slots, which can be any function ~~in FLOW3~~ in extbase.

   ...
    /**
     * @var Tx_Extbase_SignalSlot_Dispatcher
     */
    protected $signalSlotDispatcher;

    /**
     * @param Tx_Extbase_SignalSlot_Dispatcher $signalSlotDispatcher
     */
    public function injectSignalSlotDispatcher(Tx_Extbase_SignalSlot_Dispatcher $signalSlotDispatcher) {
        $this->signalSlotDispatcher = $signalSlotDispatcher;
    }
   ...

   ...
// Using a closure
$this->signalSlotDispatcher->connect(
    'Crunching', 'emitDataReady', function($data) { crunch($data); }, NULL, FALSE
);
   ...
// Using a method of the current object
$this->signalSlotDispatcher->connect(
    'Crunching', 'emitDataReady', $this, 'crunch', FALSE
);
   ...
// Using a method of the specified class
$this->signalSlotDispatcher->connect(
    'Crunching', 'emitDataReady', 'Cruncher', 'crunch', FALSE
);
   ...

To trigger the Signal which invokes the Slots registered above, you have to run the following code.

$this->signalSlotDispatcher->dispatch('Crunching', 'emitDataReady', array($data));

Even if AOP is a nicer way to implement this feature, the extbase backport still works in a pretty straightforward way.

↧

Tagging page caches

December 22, 2011, 9:23 am

≫ Next: Further reading 2011

≪ Previous: Signal / Slots in Extbase

TCEMAIN.clearCacheCmd=all|<pid>

Adding tags to the current page can be done with this block:

$GLOBALS['TSFE']->addCacheTags(array('tx_example_model:' . $id));

Removing the caches for that specific page could look like this:

$GLOBALS['typo3CacheManager']->getCache('cache_pages')->flushByTag('tx_example_model:' . $id);

Credits: Fabrizio Branca @fbrnc – thx for pointing me to it ;)

The Tcemain-Hook could be integrated with these two snippets:

EXT:example/Classes/Hooks/Tcemain.php

class Tx_Example_Hooks_Tcemain {
    public function processDatamap_afterDatabaseOperations($status, $table, $id, $fieldArray, &$reference) {
        if ($table == 'tx_example_model') {
            $GLOBALS['typo3CacheManager']->getCache('cache_pages')->flushByTag('tx_example_model:' . $id);
        }
    }
}

EXT:example/ext_localconf.php

$GLOBALS['TYPO3_CONF_VARS']['SC_OPTIONS']['t3lib/class.t3lib_tcemain.php']['processDatamapClass']['example'] = 'EXT:example/Classes/Hooks/Tcemain.php:Tx_Example_Hooks_Tcemain';

↧